Ryu Sonoda

Data Scientist, ML Engineer, and Software Engineer

Education

Columbia University, MS in Data Science

Grinnell College, BA in Computer Science with honors

I am currently pursuing a Master of Science in Data Science. I enjoy collaborating with organizations that are committed to using their data effectively. That work spans the whole pipeline: defining key metrics, designing precise data collection methods, preparing data, performing in-depth analysis, and applying machine learning techniques. My ultimate goal is to present the resulting insights in a compelling way that supports informed decision-making.

Portfolio

《View the code and paper》

Brazilian Jiu-Jitsu Image Recognition

In this research project, we compared a range of machine learning approaches, including Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and transfer learning. Our objective was to classify Brazilian Jiu-Jitsu images into 18 distinct positions, using a dataset of 120,279 labeled images. Our best models reached accuracies of up to 99.5%.

《View the paper》

Effect of Driving Alone on Poor Mental Health

In this research project, my group investigated the impact of long commutes on mental health using extra-sum-of-squares analysis. We developed a reduced model with an adjusted R² of 76.2% and identified significant predictors of poor mental health. Our findings suggested a potential connection between long commutes and diminished mental health after accounting for other factors. The project won an honorable mention in the Undergraduate Statistics Class Project competition (USCLAP).

《View the code and paper》

Effect of Driving Alone on Poor Mental Health

In this project, my group investigated the factors influencing song popularity and developed a predictive model using machine learning techniques. We explored several regression models, including ridge regression, lasso regression, and random forest regression, along with principal component analysis. We identified associations between song attributes and popularity levels, found a notable correlation between energy and loudness, and saw that random forest was the most effective model.

《View the app》 health-app

U.S. Health Map Dashboard

Users can explore trends in dozens of county-level health-related variables across the U.S., such as % of adults with obesity and % excessive drinking, and run a simple linear regression between variables.

Tech stack: R, ggplot, R Shiny

《View the app》 Youtube_dashboard

YouTube Comment Analysis Dashboard

Users can analyze YouTube video comments: categorize them into 8 emotions, visualize their word usage, and assess sentiment with a BERT-based neural network model.

Tech stack: Python, PyTorch, matplotlib, Flask, YouTube API

《View the app》 CWLO-app

Course Assessment Dashboard

Users can explore the connections between classes and learning outcomes through interactive visualizations such as Sankey diagrams and stacked bar charts.

Tech stack: R, ggplot, R Shiny, JavaScript

《View the code》 Flashcard-app

Chinese Flashcard Generator

Users can generate multiple sets of Chinese flashcards from a typed paragraph. The app uses NLP to extract vocabulary and excludes words the user has already memorized.

Tech stack: Python, tkinter, pandas, jieba

Work Experience

Research Assistant in DitecT Lab (Columbia University: Sep 2023 - present)

Contributing to research on LLM applications in transportation through literature review and LLM implementation.

Teaching Assistant (Grinnell College: Jan 2023 - May 2023)

Mentored a cohort of 60 students during lectures and lab sessions for the Computer Organization and Architecture class taught by Professor Weinman.
Demonstrated leadership abilities by planning and leading weekly mentor sessions, delivering comprehensive problem sets, and providing guidance and assistance.

Software Engineer Intern (VoicePing: June 2022 - Aug 2022)

Built a web application serving about 300 users that lets them manage virtual reality (VR) trips, using Node.js, React, and SCSS in an agile team of 7 engineers.
Modified the REST API calls to enhance the filtering features for itinerary search by the date, genre and country of the tours.
Implemented a user interface, supporting seamless multi-language and multi-device experiences, employing frameworks such as i18n and Ant Design.

Leadership Experience

Financial Development Director (HLAB: Nov 2020 - Oct 2021)

Wrote applications to foundation and regional government subsidy programs to fund the summer camp, raising $25,000 in total.
Mentored 10 high school students. Developed admissions and career workshops and a brief introductory microeconomics course.

President of Japanese Cultural Association (Grinnell College: Aug 2020 - May 2021)

Organized monthly online meetings and created a mentoring system for freshmen, fostering an inclusive environment for students who could not come to the U.S. due to the pandemic.
Managed and allocated JCA's annual budget of $1,000.

My skills

SQL

I am proficient in SQL Server and MySQL, and skilled in writing complex queries and stored procedures to extract and manipulate data efficiently.

Python

I have 3 years of experience in Python, building games and desktop apps such as the Chinese flashcard generator. I am familiar with major libraries such as pandas and Flask.

R

I have been using R for a year, mainly for statistical analysis and for building interactive visualization apps. I am familiar with several libraries, such as plotly, ggplot, and R Shiny.

Software Engineering

I gained hands-on software engineering experience during my internship, where I developed a web application for VR trips. My strong background in data structures and computer architecture helps me build efficient applications.

Machine Learning

I am familiar with various machine learning algorithms and libraries such as scikit-learn, Matplotlib, and PyTorch. My favorite project was Jiu-Jitsu position image classification, where the fine-tuned model achieved 99.5% accuracy on the test data.

Statistical Analysis

Statistical analysis is my passion. The topics I have worked on range from economic data and health data to sports data. One of my projects, "Effect of Driving Alone on Poor Mental Health", received an honorable mention in USCLAP.

Blog

R package ecosystem

College tuition & acceptance rate

Coming soon...

Coming soon...

Interested in hiring me? Let's have a chat!
