About Peak's Decision Intelligence Community
Hi Team, I am currently pursuing a Master's in Data Science at the University of Manchester and am due to graduate in September 2022. I am looking for a DS internship to grasp the real-world problem-solving aspect of Data Science. I have some knowledge of it from assignments and coursework, but that’s not enough to get me started or to be hired directly into a Data Science role. I want hands-on training so that I can land a Graduate/Junior role in the field, which an internship could make possible. Kindly let me know if Peak offers any remote internships or similar learning opportunities. Looking forward to learning more. Thanks and regards, Debashree Tripathy
Live on Thursday 7 July 2022 at 12:00 BST. Katie King is a published author, keynote speaker, trainer and consultant on Artificial Intelligence (AI), digital, STEM, leadership and business transformation. Katie has over 30 years of consulting experience and has advised many of the world's leading brands and business leaders. Author: Her first book, Using Artificial Intelligence in Marketing: How to Harness AI and Maintain the Competitive Edge, was translated into Russian, Chinese, Vietnamese & other languages. It was also listed as a reference source in the 'brand strategy' section of the World Economic Forum's Empowering AI Leadership AI toolkit for corporate boards. Her second book was published by Kogan Page in January 2022: AI Strategy for Sales and Marketing: Connecting marketing, sales and customer experience. Advisor: Katie is a member of the UK Government All-Party Parliamentary Group (APPG) task force for the adoption of AI. She is also an Editorial Board Member for the AI and Eth
Is the data science term “AB test” just a rebranding of what scientists have been calling “an experiment” for hundreds of years?
I’d like to hear everyone’s biggest mistake, regret or oversight from when they were learning data science. Hopefully we can learn from each other’s experiences and help people who are just starting to learn data science!
Most of the way through “The Beginner's Guide to Clean Data” and loving it! Such a broad book, covering all aspects of data cleaning. Written in short, clear, digestible sections. There’s a free version here, but you can also buy an e-book copy (which I did to support the author; it’s just that good): https://b-greve.gitbook.io/beginners-guide-to-clean-data/
Looking back on your career as a DS and how you got into the role, what’s the one thing you wish you had known right at the start that would’ve made becoming the amazing DS you are now easier?
Consider a scenario where you're expected to solve a very tricky probability problem but you don't know how to solve it, or another scenario where the probability problem requires specific domain knowledge in which you're not an expert. Monte Carlo Simulation will come to your rescue in such scenarios. It is a method in which we simulate the random experiment using computational algorithms. It is usually a much simpler way to find the required probability than the theoretical (or mathematical) methods; however, it gives an approximation rather than an exact answer, and it can be slow and computationally expensive. Let's understand Monte Carlo Simulation using examples! Let's start with one of the simplest and most commonly cited examples of Monte Carlo Simulation and, once we get the hang of it, we'll solve a tricky problem using the same technique. https://www.linkedin.com/posts/fazil-mohammed-4062711b2_monte-carlo-simulations-activity-6945626961729712128-tcgU?utm_source=linkedin_share
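To make the idea of simulating the random experiment concrete, here is a minimal Python sketch. The dice problem is my own choice of illustration and is not necessarily the example used in the linked post; the function names and the trial count are likewise assumptions. It estimates the probability that two fair dice sum to 7 and compares the estimate with the exact answer.

```python
import random

def estimate_probability(event, trials=200_000, seed=0):
    """Estimate P(event) as the fraction of simulated trials in which it occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(trials) if event(rng))
    return hits / trials

def two_dice_sum_to_seven(rng):
    """One simulated experiment: roll two fair dice and check whether they sum to 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7

estimate = estimate_probability(two_dice_sum_to_seven)
exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

The estimate gets closer to the exact value as the number of trials grows, which is also where the computational cost mentioned above comes from.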
Live on Tuesday 21 June 2022 at 12:00 BST. Helen Craven, graduate data scientist at Peak, joins us for an Ask Me Anything (AMA) session. Introducing our guest: Helen Craven. You’re qualified, motivated and ready to be a data scientist. But how do you land your first role? Helen Craven is here to help. Helen joined Peak as a graduate data scientist just over two months ago and she’s been doing great things with data ever since. She’s the perfect person to put your early-career data science questions to. How do you find entry-level roles? What does a great data science CV look like? Whatever your question, join Helen at 12:00 BST on Tuesday 21 June 2022 to ask her anything. Submit your questions: you can submit your questions below now, then join us on Tuesday 21 June 2022 to be part of the discussion.
❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️ Research Collection Thread: a thread of the coolest papers in forecasting ⛄️🌨️🌨️ Add any hot forecasting research you come across! 🔥🔥🔥 Please add a TLDR, a link to the paper, and ideally any relevant code base. ❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️❄️
For clustering algorithms like K-Means, there's a commonly quoted example which goes as follows. Imagine there are 5 clusters of people and, as a business owner, we need to open 5 new food joints to cater to them. K-Means helps us place our food joints strategically so as to reduce the overall time these people spend commuting to the nearest joint (for simplicity, we're going to ignore the fact that the path between any 2 points in the real world is rarely a straight line). Though this helps us a great deal in wrapping our heads around the concept of K-Means, it is technically wrong. The centroid minimizes the sum of squared distances from all the points to it, which is not equivalent to minimizing the sum of distances from all the points to it. The point which minimizes the sum of distances from all the points to it is called the geometric median. The centroid (or mean) is very easy to compute: we just have to add all the points (vector addition) and divide it by
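The difference between the centroid and the geometric median can be checked numerically. The sketch below is an illustration under stated assumptions: the five points are made up, and the geometric median is approximated with Weiszfeld's iteration, which is my choice of method and is not taken from the post.

```python
import math

# Made-up data: four points on a unit square plus one far-away outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]

def centroid(pts):
    """Component-wise mean of the points."""
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def sum_dist(c, pts):
    """Sum of straight-line distances from c to every point."""
    return sum(math.dist(c, p) for p in pts)

def sum_sq_dist(c, pts):
    """Sum of squared distances from c to every point."""
    return sum(math.dist(c, p) ** 2 for p in pts)

def geometric_median(pts, iters=200, eps=1e-12):
    """Approximate the geometric median with Weiszfeld's iteration."""
    x, y = centroid(pts)  # start from the centroid
    for _ in range(iters):
        wsum = xs = ys = 0.0
        for px, py in pts:
            d = math.dist((x, y), (px, py))
            if d < eps:
                continue  # skip a coincident point to avoid dividing by zero
            w = 1.0 / d
            wsum += w
            xs += w * px
            ys += w * py
        if wsum == 0.0:
            break
        x, y = xs / wsum, ys / wsum
    return (x, y)

c = centroid(points)
gm = geometric_median(points)
print("centroid:", c, "sum of distances:", sum_dist(c, points))
print("geometric median:", gm, "sum of distances:", sum_dist(gm, points))
```

On this data the centroid is pulled toward the outlier, so it has the smaller sum of squared distances but the larger sum of plain distances.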
This might be a bit old school, but what is your preference for visualisation and communication between Microsoft PowerPoint and Google Slides? I’ve used both. I like the number of tools Google Slides provides, but I think PowerPoint has a better overall user experience. What do you think?
The First Principal Component: PCA is the most popular dimensionality reduction technique (IMHO it is basically a data transformation technique, viewing the same data using a better choice of coordinate system). In this write-up, we're going to clear up a common misconception about the very first principal component. Almost every lecture or article explaining PCA refers to the first principal component as the 'line of best fit' for the data (not that I completely disagree). The statement is usually made as a passing remark, but 'line of best fit' is a phrase we commonly use in the context of OLS, so do they actually mean it in that sense? (Some people do take it in the sense of the OLS method, which is wrong on so many levels.) This statement warrants a clear explanation, without which it can lead to serious misconceptions. Let's first see where this particular idea stems from. You can find the full write-up and notebook here: https://www.linkedin.com/posts/fazil-mohammed-4062711b2
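One way to see that the two notions of 'best fit' differ is to compute both lines on the same data. The sketch below is my own illustration, not the notebook from the post: the six points are made up, and it relies on the standard facts that OLS (y regressed on x) minimizes squared vertical residuals, while the first principal component minimizes squared perpendicular distances to a line through the mean.

```python
import math

# Made-up 2D data for illustration.
points = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n

# Entries of the 2x2 scatter matrix of the centred data.
sxx = sum((x - mx) ** 2 for x, _ in points)
syy = sum((y - my) ** 2 for _, y in points)
sxy = sum((x - mx) * (y - my) for x, y in points)

# OLS slope of y regressed on x.
ols_slope = sxy / sxx

# Direction of the first principal component: major-axis angle of the
# scatter matrix. (tan() would blow up for a vertical axis; not the case here.)
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
pca_slope = math.tan(theta)

def vertical_sse(slope):
    """Sum of squared vertical residuals to the line through the mean."""
    return sum(((y - my) - slope * (x - mx)) ** 2 for x, y in points)

def perpendicular_sse(slope):
    """Sum of squared perpendicular distances to the line through the mean."""
    return vertical_sse(slope) / (1 + slope ** 2)

print("OLS slope:", ols_slope, "PCA slope:", pca_slope)
print("vertical SSE (OLS, PCA):", vertical_sse(ols_slope), vertical_sse(pca_slope))
print("perpendicular SSE (OLS, PCA):", perpendicular_sse(ols_slope), perpendicular_sse(pca_slope))
```

Each line wins on its own criterion and loses on the other, so the first principal component is a 'line of best fit' only in the perpendicular-distance sense.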
Most people store data in CSV, but sometimes the data is so large that it takes a long time to read in. I hear that, because they are smaller, Parquet files are faster to read, letting you get into the data sooner. The downside, I guess, is that you can’t quickly open the file in Excel to check something, but for most big data you’ll rarely do that anyway. Assuming: (1) I have some large data files that I use regularly; (2) they are about 3 million rows, so they can be slow to read in; (3) I never need to open them in Excel. When shouldn’t I be using Parquet?
Like any other ML algorithm, KMeans clustering has its own set of pros and cons. Here we're going to discuss one specific disadvantage of this algorithm: what appears to be a 'natural cluster' to the human eye is not necessarily what appears to be a cluster from the machine's perspective. #ml #clustering #kmeans #machinelearning #datascience #unsupervisedlearning https://www.linkedin.com/posts/fazil-mohammed-4062711b2_understanding-cluster-formation-in-kmeans-activity-6938104670510809088-YGhl?utm_source=linkedin_share&utm_medium=member_desktop_web I couldn’t find an option to upload the PDF, hence I've shared the above link to my post on LinkedIn.
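A small experiment can show the gap between a visually obvious grouping and what KMeans returns. The sketch below is a hypothetical illustration and not the example from the linked post: the two-strip data, the starting centroids and the helper names are all assumptions. It runs plain Lloyd's algorithm on two long, thin parallel strips and compares the result with the 'natural' one-cluster-per-strip grouping.

```python
import math

# Two long, thin horizontal strips one unit apart: to the eye, two obvious groups.
strip_a = [(float(x), 0.0) for x in range(21)]
strip_b = [(float(x), 1.0) for x in range(21)]
points = strip_a + strip_b

def mean(pts):
    """Component-wise mean of a non-empty list of 2D points."""
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

def inertia(clusters):
    """KMeans objective: total squared distance of points to their cluster mean."""
    return sum(math.dist(p, mean(c)) ** 2 for c in clusters for p in c)

def kmeans(pts, centroids, max_iter=100):
    """Plain Lloyd's algorithm; returns the final list of clusters."""
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in pts
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        clusters = [
            [p for p, a in zip(pts, assignment) if a == k]
            for k in range(len(centroids))
        ]
        # Update step (assumes no cluster ends up empty, true for this data).
        centroids = [mean(c) for c in clusters]
    return clusters

# Hypothetical starting centroids: one data point from each end of the strips.
found = kmeans(points, centroids=[(0.0, 0.0), (20.0, 1.0)])

print("inertia of the per-strip grouping:", inertia([strip_a, strip_b]))
print("inertia of the KMeans result:     ", inertia(found))
```

From this start, the algorithm splits the data into a left half and a right half, each containing points from both strips, because that partition has a lower sum of squared distances than the per-strip grouping.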
Many people leave academia for a role in data science, often from a wide variety of subject areas. Did you make the move from academia into data science? What did you find easy/hard about the transition? Is there any advice you’d give to others in a similar position? I’ve written a blog on this topic if you want to read my take!
We all want to build data science solutions that get used by customers to make substantial improvements to their organisational processes. But sometimes even the best solutions with incredible technical fundamentals can fail at the acceptance or implementation phases. What are the biggest challenges and blockers to implementation? And what (if anything) can be done about them? One example is when the major stakeholder just doesn’t agree with or understand what you’re doing, and no amount of explanation is helping (yes, models are imperfect representations of reality. No, prediction outliers are not as important as you’re making them out to be: look at the typical behaviour and overall KPI uplift). It sometimes helps if you can find another senior stakeholder that this person trusts to buy into the project. Another potential route is to get the end users to fall in love with how much more productive and happy the solution is going to make them. There is a huge opportunity cost in trying to c
I’m trying to recreate an Excel table visualisation in R so that it can be automated, as the original version is very labour intensive. From a quick Google search, it seems formattable is the best package for this, but I can’t find any documentation on adding lines between columns. Does anyone have experience with this package and know how to do that?