
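In interviews with several Data Scientists at Zühlke, one common theme is that they find it quite difficult to come up with a single definition of what they do. “It’s quite an exploratory role,” says Szymon Debski. “When we sit down to solve a problem, we don’t know what the answer is, so as Data Scientists we form a hypothesis, test it, evaluate it and repeat the cycle until we’ve solved it. It’s a lot of fun.”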

Wondering what this looks like in practice? In this article, we talk to Simon, Daphne and Andriy from the Zühlke team to find out more – and why Data Science goes way beyond machine learning and AI.

Szymon Debski, Senior Data Scientist at Zühlke. Daphne Michelson Salis. Andriy Rekalo, Principal Data Consultant at Zühlke.

Data Science is a broad field that in essence refers to the process of extracting meaning from data. Data Scientists take messy information and make use of different models and algorithms to provide usable insights. But in reality, this only covers part of their role. Much of the time, they aren’t given a clear brief, and so they need to define their own. As Andriy Rekalo, one of Zühlke’s Data Scientists, says, “the most important thing we need to answer isn’t ‘how’ but ‘what’ – what problem should we solve in the first place, what is the value in the data and how do I unlock it?”


First of all, many people confuse Data Science with Data Engineering. They are different but complementary roles. A Data Engineer is responsible for initially collecting and structuring the data in such a way that it is usable, while a Data Scientist analyses, experiments with and tests that data in order to interpret it. Should one of the models designed by the Data Scientists then be productionised, it is handed back to the Data Engineers to take it forward. The two work closely together, but they each handle specific parts of a project process.

One thing these three Zühlke Data Scientists can also agree on is that the general public doesn’t really understand what the job entails. “People often hear about buzzword tools and come to you asking to use them – for example, they think Data Scientists spend most of their time doing modelling,” says Simon.

Andriy feels the same way. “It’s portrayed as one of the trendiest jobs of the 21st century, but in reality it’s very different,” he explains. When asked why he says this, he points to all the hard work you have to do before you can get to the ‘cool’ stuff. “Much of your time is spent trying to understand issues with the data and then prepare it for analysis and modelling, and only then can you do things with it,” he says. That said, Andriy also highlights the rewarding nature of the role – namely the impact you make with your work in Data Science. “When you see a model deployed you can calculate the tangible value it brings – you can quantify it and give it a number,” he says.


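Given the name it’s not surprising, but Data Scientists do spend a lot of time working with data, so it helps to understand how it works. “I’d say you should start by building a good foundation in statistics, and getting your head around how to handle data is also helpful – especially when there’s a lot of it,” says Daphne Salis. Once you’ve mastered that, you also need to add some technical skills. “You’ll be working with data and doing tasks like data cleaning, data visualisation and data analysis, so you have to be comfortable with coding – a knowledge of Python or R is useful there,” says Simon.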

All of the team members recommend getting some hands-on experience to make sure this is a career that’s actually right for you. Simon suggests signing up for projects and competitions on sites like Kaggle, then putting a portfolio together before approaching potential employers. This also allows you to follow Andriy’s guidance of finding an aspect of Data Science you really enjoy and developing that further. “My advice would be to resist the pull to focus on the most fashionable technology right now and find something you’re genuinely interested in,” he says.


All three team members get something different out of their roles, and they are all passionate about their work. Andriy says it’s the ‘science’ part of Data Science that really appeals to him. “You’re running experiments and you don’t always know where you’ll end up. There’s way less certainty than a role like traditional software development – but that’s what makes it interesting,” he explains. For Daphne, it’s the scientific part, plus the dynamic nature of the role, that she appreciates. “I enjoy seeing new signals in the data – it’s exciting to use different theories and experiments to work out why something is happening,” she says.

One key aspect that comes up when talking about Data Science at Zühlke is the community of practice around it. Data Scientists within the organisation, as well as colleagues interested in the subject, support each other through formal and informal channels. “It’s all about growing Zühlke’s Data Science capability at the end of the day,” says Simon. He mentions initiatives like building a stock of documentation of past projects to aid with sales and delivery, co-designing new standard data products, and doing charitable work for non-profit organisations, to name a few.

A common theme that comes through again and again is the feeling of satisfaction each member of the team gets when all their hard work pays off and they see the real-world value of their efforts. This can be as large-scale as helping to develop the models behind the NHS Covid-19 test and trace app, or something as everyday as automatic claims assessment on an insurance app to make people’s lives just that much easier. Plus, as Simon says, “Even now it still feels like a bit of magic every time you get results.”


  • Data Scientists work with teams to find out which questions need to be answered using data, and how best to go about it.
  • It all starts with a solid foundation in statistics and mathematics, along with some technical skills to work with the actual data and run experiments.
  • Data Science isn’t about any one technology or method, but rather finding the right approach to make sense of the data for a given project.

