
About Me
This page is a bit about my background, skills, and interests.
Here's where I am at present, and where I'm headed:
- 3 years of OPT (STEM-qualified); no sponsorship needed
- After 3 years: I can bring you more profit than a lot of people who don't need sponsorship
- Do you want to contribute to my future? Hire me!
- Graduated with GPA 3.96/4.0
- Focus on: Data Science, Data Engineering, ML/AI, Cloud Computing, Database, Algorithms
- Certified: Tableau Data Analyst, AWS, Applied AI
Time Range
Sept. 2023 - May 2025
Projects
Please view the Projects page.
Relevant Coursework
- Data Analytics Engineering
- Deterministic Operations Research
- Data Mining
- Data Management and Database Design
- Visualization for Analytics
- Cloud Computing
- Data Structures
- Algorithms
- Statistics
- Unified data systems via ETL pipelines and warehouse modeling.
- Built Tableau dashboards and predictive models to drive business insights.
Key Achievements:
- Streamlined data quality monitoring and reporting, cutting manual workload by 25% and improving reliability
- Audited and optimized database schemas by resolving normalization issues and improving data design
- Designed and deployed ETL pipelines to unify disparate data sources into a centralized warehouse
- Built interactive Tableau dashboards with drill-down views to visualize key performance indicators
- Developed ML models to predict client outcomes, using feature engineering and scikit-learn for performance gains
- Automated routine data validation processes, boosting consistency and operational efficiency
Technical Stack:
- ETL: Apache Airflow, Python, SQL
- Visualization: Tableau
- ML: Scikit-learn, TensorFlow
- Database: MySQL, Salesforce, StarUML
- Built ETL pipeline for satellite data (TB-scale)
- Trained CV models (87% accuracy) for rural condition detection
- Contributed to ML project recognized in national competition
Key Achievements:
- Built computer vision models to detect rural environmental conditions, achieving 87% classification accuracy
- Designed and executed A/B tests to evaluate platform features and analyze user engagement
- Performed statistical analysis on usage data to uncover drivers of user retention and feature adoption
- Contributed to a nationally recognized ML project for environmental monitoring and policy insights
Technical Stack:
- Data Analysis: Excel (VBA, PivotTables, aggregate functions), Python (Pandas, Seaborn, Matplotlib), SQL, A/B Testing
- Computer Vision: OpenCV, TensorFlow, PyTorch
- Project Management: Microsoft (Teams, Project)
- Languages: Python, SQL, HTML
Achievements:
- Top 10% for all semesters
- President of Youth Leader Club
- 3 Entrepreneurship Competition Top Awards
- Published research on fraud detection
- Maintained A's in all mathematics courses
Core Competencies:
- Data Structures & Algorithms
- Database Management Systems
- Machine Learning & AI
- Software Engineering
- Computer Architecture & Networks
- Built fraud detection model (99.07% accuracy, 1M+ txns)
- Engineered features & validated via out-of-time testing
- Published in MLBDBI 2020 (DOI: 10.1109/MLBDBI51377.2020.00025)
Key Achievements:
- Cleaned and transformed 1M+ financial transactions, resolving missing values and outliers
- Engineered features using statistical methods (SelectKBest) and ROC-AUC-driven selection
- Applied out-of-time validation to simulate real-world fraud model deployment
- Optimized several decision-tree-based models (GBDT, LightGBM, XGBoost) with Bayesian hyperparameter tuning, reaching 99.07% accuracy
- Published research methodology and results in IEEE conference proceedings (DOI: 10.1109/MLBDBI51377.2020.00025)
Updated Project Details:
Please view the Projects page.
Skills & Expertise
Certifications & Certificates
- Tableau Certified Data Analyst
- AI Literacy Certificate
- Graduate Leadership Certificate
- Data Analytics on AWS
- Applied AI Certificate
Programming Languages
- Python (Advanced)
- SQL (MySQL) & NoSQL
- Java & JSP
- C/C++
- MATLAB
- Shell Scripting
- MIPS Assembly
- HTML
Software Engineering
- Data Structures
- Algorithms
- Database Design
- API Integration
- Web Development
- Security Best Practices
Data Engineering
- ETL Pipeline Development
- Database Design & Normalization
- Data Warehousing
- Data Cleaning & Processing
- Data Ingestion & Storage
- Apache Spark & Kafka
- BigQuery
Machine Learning & AI
- TensorFlow
- Scikit-learn
- NLP & Computer Vision
- Predictive Modeling
- Feature Engineering
- Time Series Analysis
- Clustering (K-Means) & KNN
- PCA & t-SNE
Analytics & Statistics
- Statistical Analysis
- A/B Testing
- Sentiment Analysis
- Text Analysis
- Operations Research
- Linear Optimization
- Data Mining
Data Visualization
- Tableau (Certified Data Analyst)
- Matplotlib
- Plotly
- Streamlit
- Dashboard Design
- Interactive Visualizations
Databases & Big Data
- MySQL
- MongoDB
- NoSQL Databases
- Database Normalization
- Query Optimization
- Big Data Processing
- Data Warehouse Design
Cloud & DevOps
- AWS Cloud Services
- Cloud Computing
- Git Version Control
- Automated Pipelines
- Data Quality Monitoring
Python Libraries
- Pandas
- NumPy
- Scikit-Learn
- TensorFlow
- Matplotlib
- Anaconda
- pip
Business Tools
- MS Office Suite
- Excel (Advanced)
- UML Modeling
- Project Management
- Technical Documentation
Specialized Skills
- Geospatial Data Analysis
- Financial Fraud Detection
- Healthcare Analytics
- Remote Sensing
- OpenCV
- GenAI Applications
What Else
Beyond the world of data and code:
I enjoy exploring beautiful trails, finding tranquility and perspective on weekend hiking adventures.
When indoors, I like reading books that expand my horizons, from data science literature to thought-provoking fiction.
I'm also an enthusiastic League of Legends player, where strategic thinking and teamwork offer a different kind of problem-solving challenge.
In my kitchen, you'll find me experimenting with baking techniques, applying the same precision and creativity that drives my professional work to create perfect pastries and breads.
My constant companion on life's adventures is:
Wangwang Shen, my beloved dog!
He made the incredible journey with me from China to the USA.
Wangwang's curiosity and joy remind me to appreciate the simple pleasures and find wonder in our surroundings, no matter how busy life gets.
We really enjoy the pet-friendly environment in the US.
To pet him and enjoy his fluffy tail, HIRE ME!
Contact
Feel free to reach out to me at yanna.cshen@gmail.com or connect with me on LinkedIn and GitHub. You can also access the links above and download my resume from the sidebar.
I'm here to help transform your data into valuable insights and profit!