Data Science Concepts for Science and Engineering Students

Four undergraduate students (Xu Chen, Timothy Odom, Alex Outkou, and Neel Surya) worked on this program this summer as part of the Mountaintop Program at Lehigh University. We are happy to report a few milestones from their work, which is part of our project to use Bokeh for teaching science and engineering concepts.

The goal of this portion of the project was to introduce data science concepts to students in engineering. It covers five aspects:

(i) a data exploration section to understand the distribution of the datasets being used;

(ii) a correlation matrix section to show the correlation between all the features in the datasets, inspired by Bokeh_CorrelationMatrix;

(iii) a multivariable regression section to build a customized regression model;

(iv) an unsupervised learning section covering clustering and principal component analysis;

and (v) a classification section to determine how datasets are partitioned into various “classes”.
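For readers who want a feel for what the five sections involve, here is a minimal sketch using pandas and scikit-learn. It is not the module's actual code: the iris dataset, the choice of models (linear regression, k-means, k-nearest neighbors), and all variable names are assumptions picked for illustration, and the Bokeh plotting layer is left out.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled iris dataset stands in for the module's datasets.
iris = load_iris(as_frame=True)
X = iris.data      # DataFrame with four numeric features
y = iris.target    # Series of integer class labels (0, 1, 2)

# (i) Data exploration: summary statistics of each feature's distribution.
summary = X.describe()

# (ii) Correlation matrix: pairwise Pearson correlation between features.
corr = X.corr()

# (iii) Multivariable regression: predict one feature from the other three.
target_col = "petal width (cm)"
reg = LinearRegression().fit(X.drop(columns=target_col), X[target_col])
r2 = reg.score(X.drop(columns=target_col), X[target_col])

# (iv) Unsupervised learning: project to 2 principal components and
# group the samples into 3 clusters without using the labels.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# (v) Classification: fit on a training split, score on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(summary)
print(corr.round(2))
print(f"regression R^2: {r2:.3f}")
print(f"variance explained by 2 PCs: {pca.explained_variance_ratio_.sum():.3f}")
print(f"classification accuracy: {accuracy:.3f}")
```

Each of these results (`summary`, `corr`, `X_2d`, `cluster_labels`, and so on) is the kind of object one would then hand to a plotting library such as Bokeh to make the section interactive.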

Using example code as a template and drawing on the detailed documentation of Bokeh and scikit-learn, Xu, Timothy, Alex, and Neel were able to work independently throughout the project. That is a testament to their drive and to how well documented Bokeh and scikit-learn are.

You can access this STEM exercise module at this website.

The SIR infectious disease model is being used as a tool in the classroom this Fall. Prof. Rangarajan received an NSF CAREER award, for which these modules were well received and positively reviewed.


@swamilikes2code ,

Thank you again for more incredible showcase examples from your students! It’s so exciting to see Bokeh used in the classroom like this. If any students have Twitter handles I can tag in an @bokeh tweet about their work, let me know!


Thank you @carolyn .

The students do not have Twitter handles. You can use mine, @raghulikes2code. I will share the students' LinkedIn profiles; if you can post on LinkedIn too, that would be great.

Here are the LinkedIn handles:





This work is being presented as a lightning talk this Thursday, October 27th, at 4:00 PM Eastern Time (2000 UTC) at PyData Global.

Please drop by if you are interested!


Data science is a unique and powerful approach to solving all kinds of problems. It can be used for everything from improving customer experience to preventing fraud, and everything in between.

I like data science because it’s analytical, but I also like the human element. It’s about looking at a problem and figuring out how to solve it. It incorporates both sides of my brain.

I’m also really interested in big data, which is basically just a collection of large amounts of information that can be analyzed to help people make decisions. I’ve been thinking about it as an application for my career.