
Summer AI Machine Learning Course & Exoplanets Project

  • Writer: Justin Brown
  • Jul 13, 2025
  • 3 min read

Updated: Dec 13, 2025


For two weeks this summer I took a virtual AI course from Inspirit AI, where I learned how AI models work. Along with a couple of others, I built a model that detects exoplanets. If you don't know what exoplanets are, they are planets that orbit a star outside of our solar system.


Exoplanet detection project & training data

During the first week of the course, we learned how AI models work and how to train them in Python. To do this, we use known data for both training and testing: the model learns from the training data and is then checked against the testing data. To train the model we used real Kepler Space Telescope data, which records light fluctuations from distant stars.
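To make the training/testing idea concrete, here is a minimal sketch in plain Python. It is not the code from the course: the function name `train_test_split`, the 20% test fraction, and the toy data are all assumptions chosen for illustration.

```python
import random

def train_test_split(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle the data and hold out a fraction of it for testing."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

# Hypothetical toy data: 10 short "light curves"; label 1 = exoplanet, 0 = not.
curves = [[1.0, 0.99, 1.0, 1.01] for _ in range(10)]
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
(train_x, train_y), (test_x, test_y) = train_test_split(curves, labels)
```

The model only ever sees `train_x` and `train_y` while learning, so its score on `test_x` shows how it handles stars it has not seen before.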


To tell whether these fluctuations signal an exoplanet orbiting the star, we check whether the patterns show symmetry over a specific time period. Because an exoplanet orbits its star, the light changes that occur when the planet passes in front of the star (as seen from the telescope) should come at consistent intervals. Meteors or asteroids (non-exoplanets), by contrast, would result in random and inconsistent changes in light flux.
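The "consistent intervals" idea can be sketched in a few lines of Python. This is a simplified illustration, not how our model or the Kepler pipeline works: the helper names, the 0.99 threshold, and both light curves are made up.

```python
def find_dips(flux, threshold):
    """Return the index of the lowest point in each run of values below threshold."""
    dips, run = [], []
    for i, value in enumerate(flux):
        if value < threshold:
            run.append(i)
        elif run:
            dips.append(min(run, key=lambda j: flux[j]))
            run = []
    if run:  # a dip that lasts until the end of the series
        dips.append(min(run, key=lambda j: flux[j]))
    return dips

def looks_periodic(dip_positions, tolerance=1):
    """True if there are at least 3 dips and the gaps between them are nearly equal."""
    if len(dip_positions) < 3:
        return False
    gaps = [b - a for a, b in zip(dip_positions, dip_positions[1:])]
    return max(gaps) - min(gaps) <= tolerance

def make_curve(length, dip_positions, depth=0.02):
    """Build a flat, made-up light curve with small dips at the given positions."""
    curve = [1.0] * length
    for p in dip_positions:
        curve[p] -= depth
    return curve

planet_like = make_curve(100, [10, 35, 60, 85])  # a dip every 25 steps
irregular = make_curve(100, [7, 19, 58, 91])     # dips at uneven gaps

planet_periodic = looks_periodic(find_dips(planet_like, threshold=0.99))
irregular_periodic = looks_periodic(find_dips(irregular, threshold=0.99))
```

Real light curves are noisy, so a fixed threshold like this would not be enough on its own. That is part of why a trained model is useful.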


However, to make the data easier for the model to train on, we have to use different polishing techniques, like the SMOTE function (Synthetic Minority Over-sampling Technique). The training data contains vastly more non-exoplanets than exoplanets, so it is easy for the model to become biased toward seeing non-exoplanets. To correct this we use SMOTE, which creates synthetic data to even out the amount of each type of data, in this case exoplanets and non-exoplanets. This allows the model to "learn" how to spot these differences in light fluctuation patterns more efficiently.
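To show roughly what SMOTE does, here is a simplified, SMOTE-style sketch in plain Python. It is not the library function we used in the course: `smote_like`, its parameters, and the two-number "features" are assumptions for illustration. The core idea is to pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line between them.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest *other* minority samples, by Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # how far to move from base toward neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Made-up 2-feature samples: 8 non-exoplanets, only 3 exoplanets.
non_exoplanets = [[0.1 * i, 5.0 + i] for i in range(8)]
exoplanets = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1]]

extra = smote_like(exoplanets, n_new=len(non_exoplanets) - len(exoplanets))
balanced_exoplanets = exoplanets + extra  # now 8 of each class
```

Synthetic points should only be added to the training data. The testing data stays real, so the score reflects how the model does on actual stars.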


AI Model Types

There are a few different ways to train an AI model. The five types we used were decision trees, linear regression, logistic regression, k-nearest neighbors (kNN), and a convolutional neural network (CNN). All five, decision trees included, are machine learning methods, in which the model recognizes patterns that appear in the data.
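Of these, k-nearest neighbors is the easiest to show in a few lines. Here is a minimal sketch, assuming each star has already been boiled down to two made-up features (dip depth and how uneven the gaps between dips are). The features, the data, and the function name `knn_predict` are hypothetical and are not from our project.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a new sample by majority vote among its k closest training samples."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features: [dip depth, spread of the gaps between dips].
train_x = [
    [0.020, 0.0], [0.018, 1.0], [0.021, 0.5],    # label 1: exoplanet
    [0.005, 20.0], [0.010, 27.0], [0.003, 15.0], # label 0: non-exoplanet
]
train_y = [1, 1, 1, 0, 0, 0]

regular_star = knn_predict(train_x, train_y, [0.019, 0.8])
irregular_star = knn_predict(train_x, train_y, [0.004, 18.0])
```

In this toy data the second feature is much larger than the first, so it dominates the distance. In practice, features are usually scaled to similar ranges before using kNN.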


CNN Model with only one false positive

The last model we tried was a convolutional neural network (CNN). This model is the most advanced of them all and is very good at finding differences between images or sets of data. As expected, it gave us the best results and was the only one that got all of the testing and training data correct except for one false positive. This is not to say that the other types of models weren't good, as they also showed positive results with only a few false positives and false negatives.
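The "convolutional" part of a CNN means sliding a small filter along the data and measuring how well each spot matches it. Here is a minimal sketch of that one step in plain Python. It is not our actual model: the filter values are hand-picked for illustration, whereas a real CNN learns its filters during training and stacks many of them.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal and take a dot product at each position."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(s * w for s, w in zip(signal[i:i + len(kernel)], kernel))
        for i in range(n)
    ]

# Made-up light curve with one small dip centred at index 4.
flux = [1.0, 1.0, 1.0, 0.98, 0.97, 0.98, 1.0, 1.0, 1.0]

# Hand-picked filter: gives a high score where a value is lower than both neighbours.
dip_kernel = [1.0, -2.0, 1.0]

response = conv1d(flux, dip_kernel)
strongest = max(range(len(response)), key=lambda i: response[i])
dip_centre = strongest + len(dip_kernel) // 2  # middle of the best-matching window
```

Here `dip_centre` lands on the bottom of the dip, which is the kind of local pattern that the early layers of a CNN pick up on.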


The most concerning errors, however, are the false negatives. Because there are so few exoplanets in real-world data, it is extremely important that the model does not miss them. False positives are not as bad, because they should be fairly easy to catch given how few positives there actually are.
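These error types can be counted directly, and the standard name for "how many of the real exoplanets did we catch" is recall. Here is a small sketch in plain Python. The labels and predictions below are made up for illustration and are not our model's actual results.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, and true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def recall(tp, fn):
    """Share of real exoplanets that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    """Share of predicted exoplanets that were real."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# Made-up example: 1 = exoplanet, 0 = non-exoplanet.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
model_recall = recall(tp, fn)
model_precision = precision(tp, fp)
```

A model that misses exoplanets has low recall, even if its overall accuracy looks high because most stars are non-exoplanets.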


Presentation

At the end of the course, our group put together a presentation to show the rest of the class what we had been working on. This is something all 12 groups had to do on the last day of the course. I found the course super well put together, and I learned a ton about AI models. It gave me a new appreciation for how complicated all of this truly is, but I also found it super interesting, and now I have some AI modeling experience!


Examples of Exoplanet vs Non-Exoplanet light graphs from our dataset






