Thank you for participating in this blog series!
What is your name and your current title?
Vilhelm von Ehrenheim, Data Engineer at EQT.
What are you working on right now, something exciting you can share?
A model for competitor monitoring running on Google Dataflow.
How did you get where you are?
By chance and hard work. :) I think a career is an evolutionary process in some sense. Depending on who you meet and who you work with, you will learn new things and change your priorities.
But more concretely, I did an MSc in Engineering Physics at Lund Institute of Technology, specializing in Machine Learning. After that I worked as a Data Scientist at Klarna for four years before I joined the Motherbrain team at EQT.
What are the most interesting aspects of your current job?
That I have a ton of interesting things to learn and try out. I work in a small agile team that lets me focus on solving problems and implementing the solutions in a tech stack that is new and inspiring. EQT is a great place to work.
Is there anything you would have liked to know about being a Data Scientist before starting a career in this sector?
That it is not enough to be awesome at algorithms to be a Data Scientist. You also need a good feel for the business and to understand what you are actually trying to solve. I think the most common mistake new Data Scientists make is not defining proper goals for projects that tie into what the business needs, and instead just hunting for those extra accuracy points on a problem that is not very well defined in the first place.
What technologies do you believe will become the next “big thing”, both in the short term and the long term?
I think Apache Beam will become more and more popular as stream processing is getting bigger. I think triggers and retractions are really nice features that make it stand out. It is also somewhat agnostic to the underlying execution engine, which makes it easier to avoid lock-in to one specific setup or provider.
Lastly, do you have any interesting books in our sector to recommend for summer reading?
I think Designing Data-Intensive Applications by Martin Kleppmann is a must-read for everyone who is interested in data processing.
For those who are interested in stream processing and what Apache Beam is all about, I would recommend the blog series The world beyond batch: Streaming 101 and Streaming 102 by Tyler Akidau. Both are easy to find on your favourite search engine.
Thanks again for participating, Vilhelm!