Based on large-scale app installation data, we built an automated process for T-Mobile that segments users and creates persona labels with machine learning, with the aim of improving mobile advertising services. We clustered users with social network analysis, ran topic modeling (LDA) on apps, and extracted distinctive keywords for each community with linear sum assignment to build user personas. Each persona shows multi-dimensional characteristics of a mobile audience, such as the popular apps, genres, and topics within each community.
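The keyword step above can be sketched as an assignment problem. This is a minimal illustration, not the project's actual code: it assumes a matrix of distinctiveness scores (communities by keywords) has already been computed, and the function name `assign_keywords`, the community names, the keywords, and the scores are all hypothetical. It uses `scipy.optimize.linear_sum_assignment` to give each community one keyword so that the total score is maximal and no keyword is reused.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_keywords(scores, communities, keywords):
    """Pick one keyword per community, maximizing the total score.

    scores[i, j] is an assumed distinctiveness score of keyword j
    for community i. No keyword is assigned to two communities.
    """
    rows, cols = linear_sum_assignment(scores, maximize=True)
    return {communities[r]: keywords[c] for r, c in zip(rows, cols)}


# Hypothetical data, for illustration only.
communities = ["c0", "c1", "c2"]
keywords = ["games", "finance", "fitness", "news"]
scores = np.array([
    [0.9, 0.1, 0.2, 0.3],
    [0.8, 0.7, 0.1, 0.2],
    [0.2, 0.1, 0.6, 0.5],
])

labels = assign_keywords(scores, communities, keywords)
print(labels)
```

Here "games" scores highest for both c0 and c1, so a greedy per-community pick would reuse it; the assignment formulation gives it to c0 and falls back to "finance" for c1, which keeps the labels distinct across communities.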
Created a data warehouse for the Sparkify user activity log on Redshift, configured Airflow for scheduling, used Spark to predict user retention, and connected the results to a Tableau dashboard.
Dimensional Model & Tableau: Designed sales BI systems for XYZ retailer companies, covering dimensional model design, automated table staging from source tables to target tables, data cleaning and preprocessing in SSIS/SSMS, and Tableau dashboard design and business analysis.
Ridge Regression / Random Forest: Provides homeowners with an appropriate Airbnb rental rate estimate.
Crack the data challenge within one month! Covers fraud detection, user segmentation, A/B testing, clustering-based recommendation systems, etc.
Tableau: Data visualization of user activity tracking for Dognition, used to analyze the company's marketing strategy.
Logistic Regression / Random Forest / KNN: Identifies customers who are likely to stop using the service in the future and analyzes the top factors that influence user retention.
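One common way to surface the top retention factors from a logistic regression is to standardize the features and rank them by absolute coefficient. The sketch below assumes that approach; it is not necessarily how the project ranks factors. The function `rank_churn_factors`, the feature names, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def rank_churn_factors(X, y, names):
    """Rank features by the magnitude of their logistic regression coefficient.

    Features are standardized first so coefficients are comparable.
    A positive coefficient means the feature pushes toward churn (y = 1).
    """
    X_std = StandardScaler().fit_transform(X)
    model = LogisticRegression().fit(X_std, y)
    coefs = model.coef_[0]
    order = np.argsort(-np.abs(coefs))
    return [(names[i], float(coefs[i])) for i in order]


# Synthetic data (illustration only): churn is driven mostly by the
# first feature, weakly by the second, and not at all by the third.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
logits = 2.0 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

names = ["support_tickets", "monthly_usage", "account_age"]
ranking = rank_churn_factors(X, y, names)
for name, coef in ranking:
    print(f"{name}: {coef:+.2f}")
```

Coefficient magnitude is only a rough guide when features are correlated; tree-based importances from the random forest can serve as a cross-check.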
Uses HTML/D3.js to render a heatmap of popular ratings.