In this research project, we compared several machine learning approaches, including Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and transfer learning. Our objective was to classify Brazilian Jiu-Jitsu images into 18 distinct positions, using a dataset of 120,279 labeled images. Our best models reached accuracies of up to 99.5%.
《View the paper》
In this research project, my group investigated the impact of long commutes on mental health using extra sums of squares analysis. Through model comparison, we developed a reduced model with an adjusted R² of 76.2% and identified significant predictors of poor mental health. Our findings suggested a potential connection between long commutes and diminished mental health after accounting for other factors. The project won an honorable mention in the Undergraduate Statistics Class Project competition.
In this project, my group investigated the factors influencing song popularity and developed a predictive model using machine learning techniques. We explored ridge, lasso, and random forest regression, along with principal component analysis. We identified associations between song attributes and popularity levels, with a notable correlation between energy and loudness, and random forest proved to be the most effective model.
Users can explore trends in dozens of county-level health-related variables across the U.S., such as % of adults with obesity and % excessive drinking, and run a simple linear regression between variables.
Tech stack: R, ggplot, R Shiny
Users can analyze YouTube video comments, categorize them into 8 emotions, visualize the words used, and assess sentiment with a BERT-based neural network model.
Tech stack: Python, PyTorch, matplotlib, Flask, YouTube API
Users can explore the connections between classes and learning outcomes through interactive visualizations such as Sankey diagrams and stacked bar charts.
Tech stack: R, ggplot, R Shiny, JavaScript
Users can generate multiple sets of Chinese flashcards from a typed paragraph using NLP, with already-memorized vocabulary excluded.
Tech stack: Python, tkinter, pandas, jieba
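A minimal sketch of the filtering and batching step described above, assuming the paragraph has already been segmented into words (the project lists jieba for this). The function name `build_flashcard_sets` and the `set_size` parameter are hypothetical and not taken from the app itself; punctuation filtering is left out for brevity.

```python
from typing import Iterable, List, Set


def build_flashcard_sets(
    words: Iterable[str], known: Set[str], set_size: int = 10
) -> List[List[str]]:
    """Split new vocabulary into flashcard sets, skipping memorized words.

    `words` is assumed to be the already-segmented paragraph; `known` holds
    the vocabulary the user has memorized. Both names are illustrative.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1")

    seen = set()
    new_words = []
    for word in words:
        word = word.strip()
        # Skip blanks, memorized words, and repeats within the paragraph.
        if not word or word in known or word in seen:
            continue
        seen.add(word)
        new_words.append(word)

    # Chunk the remaining words, in order of first appearance, into sets.
    return [new_words[i:i + set_size] for i in range(0, len(new_words), set_size)]


if __name__ == "__main__":
    tokens = ["我", "喜欢", "学习", "中文", "我", "喜欢", "音乐"]
    memorized = {"我", "喜欢"}
    print(build_flashcard_sets(tokens, memorized, set_size=2))
    # → [['学习', '中文'], ['音乐']]
```

Keeping the words in order of first appearance means each flashcard set follows the flow of the original paragraph.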
I am proficient in SQL Server and MySQL, and skilled at writing complex queries and stored procedures to extract and manipulate data efficiently.
I have 3 years of experience in Python, creating games and desktop apps such as a Chinese flashcard generator. I am familiar with major libraries such as pandas and Flask.
I have been using R for a year, mainly for statistical analysis and building interactive visualization apps. I am familiar with several libraries such as plotly, ggplot, and R Shiny.
I gained hands-on experience as a software engineer during my internship, where I developed a web application for VR trips. My strong background in data structures and computer architecture helps me build efficient apps.
I am familiar with various machine learning algorithms and libraries such as scikit-learn, Matplotlib, and PyTorch. My favorite project was Jiu-Jitsu position image classification, where the fine-tuned model achieved 99.5% accuracy on the test data.
Statistical analysis is my passion. The topics I have worked on range from economic data and health data to sports data. One of my projects, "Effect of Driving Alone on Mental Health," received an honorable mention in USCLAP.
Explored the factors influencing package popularity and characteristics within the ecosystem.
Analyzed two decades of trends in college tuition and acceptance rates in the U.S. through various visualizations.