What is a “Data Scientist”? Many of us may still be confused by this relatively new profession. The first use of the term “data science” has been traced back to Danish computer scientist Peter Naur in 1974. DJ Patil, the first Chief Data Scientist for the U.S. Government in 2015, is often credited with popularizing the profession.
So we sat down for a (virtual) chat with our Chief Data Scientist, Begatim Berisha, who, along with sharing his own background and motivations, provided some helpful clarifications on what exactly a data scientist does.
Data Scientists take raw data, clean it, process it and extract patterns from that data to inform a company’s decision making. Once the data is processed, we use algorithms to parse through that data and create predictive models that learn from that data. With enough inputs, you can build and train a machine learning app from scratch to do your work for you. A simple example: if you want a machine to recognize the word “yes,” you give it a lot of data of people saying “yes” so it can learn to recognize it on its own.
We spend 70% of our time preparing the data. Once it is ready, we start writing our algorithms to learn from that data. Once we’ve built the model, we feed a portion of the data into it and set aside the rest to validate the model's performance.
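To make the "feed a portion, set aside the rest" step concrete, here is a minimal sketch of a hold-out split using only the Python standard library. The function name `split_holdout`, the 80/20 ratio, and the toy rows are assumptions chosen for illustration; they are not a description of Attributy's actual pipeline.

```python
import random


def split_holdout(rows, train_fraction=0.8, seed=0):
    """Shuffle rows and split them into a training part and a held-out part.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy data: ten numbered records standing in for prepared rows.
rows = list(range(10))
train_rows, holdout_rows = split_holdout(rows, train_fraction=0.8, seed=42)

print(len(train_rows), "rows for training,", len(holdout_rows), "held out")
```

The model would be fitted on `train_rows` only, and `holdout_rows` would then be used to check how well it performs on data it has not seen.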
Data: the biggest problem a Data Scientist can have is bad data. Only a small portion of a model’s performance is attributable to its design and data processing; the rest relies completely on the quality of the data, and that is usually something data scientists have little control over.
In our work with Attributy, we give our clients a tracking code that they deploy on their website to track activity and conversions. We ask them to use UTM tags when they advertise on a channel; that’s how we can track an individual campaign as opposed to the general traffic a site generates.
For example, if a client has a campaign called “Summer Offer,” they need to tag it with exactly that name every time. If the same campaign also goes by “Summer” or “Summer Sale,” the data is not consistent and we can’t develop a clear picture of customer conversions. On the other hand, if our clients do not use UTM tags at all, a lot of information is lost and a model that relies on those tags will become useless.
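The effect of inconsistent tagging can be sketched in a few lines of Python. The example below reads the `utm_campaign` parameter from some made-up landing URLs and counts visits per campaign. The URLs, the `campaign_of` helper, and the "(untagged)" label are hypothetical, and this is not Attributy's tracking code.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def campaign_of(url):
    """Return the utm_campaign value of a URL, or None if the tag is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("utm_campaign")
    return values[0] if values else None


# Made-up visits: one real-world campaign tagged three different ways,
# plus one visit with no UTM tags at all.
visits = [
    "https://example.com/?utm_source=newsletter&utm_campaign=Summer+Offer",
    "https://example.com/?utm_source=social&utm_campaign=Summer",
    "https://example.com/?utm_source=social&utm_campaign=Summer+Sale",
    "https://example.com/?utm_source=newsletter&utm_campaign=Summer+Offer",
    "https://example.com/pricing",
]

counts = Counter(campaign_of(url) or "(untagged)" for url in visits)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Although four of the five visits belong to the same campaign, they are spread over three different names, and the fifth cannot be attributed to any campaign.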
Interested in optimizing your campaigns with Attributy’s help? Sign up for free here
I received a Bachelor’s degree in Economics & Management from the Rochester Institute of Technology, and after college I worked as a Sales Channel Manager at a company called Studio Moderna. I had to do a lot of things, and one of them was dealing with databases. I had always loved math but didn’t know what to do with it; once I began working with these databases, I figured it out. So when I started researching Master’s programs, I found Data Science and knew it was a perfect fit. I got a Master’s at Tilburg University in the Netherlands and completed an internship while there before starting at Attributy.
Yes! For those who don’t know, a Hackathon is essentially an event where an organization invites Data Scientists to tackle a data science problem. At the NLB Hackathon I attended, the problem we had to solve was automating loan approval with AI, and finding a way to explain to each customer why their loan was approved or denied. The F1 score is a balanced measure of a model’s performance; it ranges from 0 to 1, and the better the model is, the closer to 1 it will be. I placed in the top five with an F1 score of 0.9321.
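For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision (how many predicted positives were correct) and recall (how many actual positives were found). The sketch below computes it for binary labels in plain Python. The labels are invented for illustration, with 1 assumed to mean "approved" and 0 "denied"; they have nothing to do with the hackathon data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for one class from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is 0
    precision = tp / (tp + fp)  # share of predicted positives that are right
    recall = tp / (tp + fn)     # share of actual positives that were found
    return 2 * precision * recall / (precision + recall)


# Invented labels: 1 = approved, 0 = denied (assumption for this example).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.4f}")
```

Here the model makes one false approval and misses one real approval, so precision and recall are both 4/5 and the F1 score is 0.8. A perfect model would score 1.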
I love dirt biking; after work I’ll jump on my bike and take a ride through the mountains.
I’d probably be an architect; that would be another way to use my math skills.
Interested in working for Attributy? Check out our open jobs