Teacher ➡️ Data analyst
Taking complexity and transforming it into beautiful simplicity
The old saying, “age is nothing but a number,” can hold a different meaning in the workplace. As an intern in IBM’s Data Analyst team, I found myself questioning the significance of age within our professional realm. Recent waves of employees departing from our organization have brought age into the spotlight, making me wonder: Could it be more than just a number when it comes to attrition? Using the statistical power of R, I aim to determine whether age has a significant statistical impact on attrition within our organization.
The dataset at hand, crafted by IBM’s data experts, provides valuable insights. It consists of 1470 rows, each representing an employee, and boasts 35 columns filled with information.
At the heart of our exploration is the “Attrition” column, which tells us whether an employee left or stayed. The dataset does not say whether “Attrition” covers voluntary departures, layoffs, or both, so for this analysis we’ll assume it reflects voluntary exits.
With this data, my mission is clear: to uncover the connection between age and attrition within our organization. While we’ve heard that younger employees tend to leave more, we want to dig deeper. We’re looking for the details that can shape our HR strategy, ensuring we keep our team engaged and strong.
Our initial exploration of employee age distribution reveals a fascinating pattern.
The histogram of age data shows a positive skew: the peak frequency occurs at the lower ages, with a longer tail stretching toward the older ones. This suggests that our organization has a predominantly younger workforce, with an average age of approximately 37.
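To make "positive skew" concrete, here is a minimal sketch in R. It does not use the IBM dataset: the `age` vector is synthetic, drawn from a gamma distribution chosen only because it is right-skewed, and the `skewness` helper is a hypothetical name for a standard moment-based formula. The numbers it prints will not match the figures in this post.

```r
# Synthetic stand-in for the Age column (NOT the IBM data): a right-skewed
# sample shifted so that ages start at 18. All parameters are assumptions.
set.seed(42)
age <- round(18 + rgamma(1470, shape = 4, scale = 5))

# Bin the ages in 5-year intervals without drawing anything.
# Calling plot(age_hist) afterwards would draw the histogram.
age_hist <- hist(age, breaks = seq(15, max(age) + 5, by = 5), plot = FALSE)

# Moment-based sample skewness (hypothetical helper): a positive value
# means the longer tail is on the right, toward older ages.
skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

age_mean <- mean(age)
age_median <- median(age)
age_skew <- skewness(age)
print(c(mean = age_mean, median = age_median, skewness = age_skew))
```

With the real data, the same calls would run on the dataset's Age column instead of the simulated vector.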
Digging deeper, we explored correlations between various factors.
Notably, we discovered a strong correlation between age and total working years: older employees tend to have more years of work experience. Age and monthly income are also positively correlated, albeit with some exceptions, so in general older employees tend to have higher monthly incomes.
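A correlation check like this one could be sketched as follows. The data frame is simulated, not the IBM data; only the column names (Age, TotalWorkingYears, MonthlyIncome) come from the dataset, and the relationships between them are assumptions made so the example has some structure to find.

```r
# Simulated employees (NOT the IBM data). Coefficients and noise levels
# below are arbitrary assumptions for illustration.
set.seed(42)
n <- 1470
age <- round(runif(n, 18, 60))
total_working_years <- pmax(0, round((age - 18) * 0.6 + rnorm(n, sd = 4)))
monthly_income <- 2000 + 400 * total_working_years + rnorm(n, sd = 3000)

employees <- data.frame(
  Age = age,
  TotalWorkingYears = total_working_years,
  MonthlyIncome = monthly_income
)

# Pairwise Pearson correlations between the three numeric columns.
cor_matrix <- cor(employees)
print(round(cor_matrix, 2))

# cor.test adds a p-value and confidence interval for a single pair.
age_years_test <- cor.test(employees$Age, employees$TotalWorkingYears)
print(age_years_test)
```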
The crux of our investigation centers around the relationship between age and attrition. Initial observations from box plots of age by attrition status hint at a difference in age between employees who left and those who stayed.
To rigorously test this, we conducted a t-test comparing the mean age of employees who left with that of employees who stayed, and the results were striking. The extremely low p-value (1.38e-08) provided robust statistical evidence against the null hypothesis that the two groups have the same mean age. We confidently reject the null hypothesis in favor of the alternative: age and attrition are indeed associated.
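For readers who want to see what such a test looks like in R, here is a rough sketch on simulated data. The post does not say which variant of the t-test was run; this uses R's default for `t.test`, the Welch two-sample test, which does not assume equal variances. The attrition probabilities are an assumption built into the simulation, so the p-value printed here is unrelated to the 1.38e-08 reported above.

```r
# Simulated employees (NOT the IBM data). The probability of leaving is
# made to fall with age purely as an assumption for this example.
set.seed(42)
n <- 1470
age <- round(runif(n, 18, 60))
p_leave <- plogis(1 - 0.08 * age)
attrition <- factor(ifelse(runif(n) < p_leave, "Yes", "No"),
                    levels = c("No", "Yes"))
employees <- data.frame(Age = age, Attrition = attrition)

# Null hypothesis: mean age is the same for leavers and stayers.
# t.test defaults to the Welch test (var.equal = FALSE).
age_ttest <- t.test(Age ~ Attrition, data = employees)
print(age_ttest)

# The two group means and the p-value can be pulled out directly.
group_means <- age_ttest$estimate
p_value <- age_ttest$p.value
```

A matching box plot could be drawn with `boxplot(Age ~ Attrition, data = employees)`.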
To strengthen the credibility of our findings, we took a closer look by categorizing employees into distinct age groups and subjecting them to a rigorous Chi-Square test.
The results were striking, revealing an incredibly low p-value of approximately 4.341e-11. This outcome points to a substantial and non-random association between age group and attrition within our organization. In practical terms, attrition rates vary significantly across the age groups in our dataset, which makes age an important factor when deciphering attrition patterns.
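The grouping and the test could be sketched like this. The data is simulated again, and the bracket boundaries are partly an assumption: the post names the 26-35, 36-45 and 56-60 groups, while the 18-25 and 46-55 brackets used here are our guess at how the remaining ages were split.

```r
# Simulated employees (NOT the IBM data); attrition probability falls
# with age only because the simulation assumes it.
set.seed(42)
n <- 1470
age <- round(runif(n, 18, 60))
attrition <- factor(ifelse(runif(n) < plogis(1 - 0.08 * age), "Yes", "No"),
                    levels = c("No", "Yes"))
employees <- data.frame(Age = age, Attrition = attrition)

# Bucket ages into brackets. cut() uses right-closed intervals, so
# breaks of 17, 25, 35, ... give 18-25, 26-35, and so on.
employees$AgeGroup <- cut(
  employees$Age,
  breaks = c(17, 25, 35, 45, 55, 60),
  labels = c("18-25", "26-35", "36-45", "46-55", "56-60")
)

# Contingency table of age group by attrition status.
group_table <- table(employees$AgeGroup, employees$Attrition)
print(group_table)

# Chi-square test of independence between age group and attrition.
chisq_result <- chisq.test(group_table)
print(chisq_result)

# Share of stayers and leavers within each age group (rows sum to 1).
row_shares <- prop.table(group_table, margin = 1)
print(round(row_shares, 3))
```

The row shares are the natural input for a grouped or stacked bar chart of attrition by age group.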
To provide additional clarity, let’s delve into the visualization:
These color cues offer a straightforward way to understand the attrition status within each age group. But what does it mean when you see approximately 20% for both the blue and red bars within each age group? It signifies a vital insight:
What’s truly interesting is that only two age groups, the 26-35 and 56-60 brackets, appear in red. One could speculate on possible reasons for this intriguing pattern.
It’s conceivable that the 26-35 age group may find themselves in a relatively comfortable position in life, less inclined to explore new opportunities, and perhaps more content in their roles. Conversely, the 56-60 age group, nearing retirement age, may be firmly established in their careers, making the prospect of leaving less appealing.
However, it’s the data concerning the 36-45 and 46-55 age groups that raises eyebrows and concerns. These cohorts represent some of our more seasoned employees, individuals with valuable experience and expertise. The fact that they are opting to leave warrants a closer look and raises important questions about what might be driving this unexpected trend.
As we delve deeper into the relationship between age and various aspects of employment, we turn our attention to predictive modeling. Specifically, we’ve created a linear regression model that predicts MonthlyIncome based on age. The power of R makes this task remarkably straightforward. We utilize the ‘lm’ function, short for linear model, to build our predictive model.
To gain insights into the model’s performance, we invoke the ‘summary’ function with ‘summary(model1)’. The results provide us with valuable information, including the R² value, which quantifies the model’s explanatory power.
In our case, the R² value is 0.2479, indicating that age can explain approximately 25% of the variance in MonthlyIncome. Moreover, the model’s p-value is nearly 0, so the relationship is statistically significant at the 5% level. This model illustrates how age alone can serve as a predictor of MonthlyIncome, but it’s just the beginning. In future analyses, we can explore more complex models with multiple variables, paving the way for a deeper understanding of the factors influencing our workforce.
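A minimal sketch of this workflow, using ‘lm’ and ‘summary’ as described. The name ‘model1’ and the formula follow the post, but the data frame is simulated with an assumed linear relationship, so the R² printed here will not be 0.2479.

```r
# Simulated employees (NOT the IBM data). Intercept, slope and noise are
# arbitrary assumptions for illustration.
set.seed(42)
n <- 1470
age <- round(runif(n, 18, 60))
monthly_income <- 1000 + 150 * age + rnorm(n, sd = 3500)
employees <- data.frame(Age = age, MonthlyIncome = monthly_income)

# Simple linear regression: MonthlyIncome explained by Age alone.
model1 <- lm(MonthlyIncome ~ Age, data = employees)
model1_summary <- summary(model1)
print(model1_summary)

# R-squared: the share of variance in MonthlyIncome explained by Age.
r_squared <- model1_summary$r.squared

# Overall model p-value, recomputed from the F statistic that summary()
# reports (value, numerator df, denominator df).
f_stat <- model1_summary$fstatistic
model_p_value <- pf(f_stat[1], f_stat[2], f_stat[3], lower.tail = FALSE)
```

With a single predictor, the overall F-test and the t-test on the Age coefficient give the same p-value.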
In our quest to understand the dynamics of income within our workforce, we’ve ventured further into the realm of predictive modeling. Building upon our initial linear regression model, which solely considered ‘Age’ as a predictor of ‘MonthlyIncome,’ we’ve taken a more comprehensive approach. In this new model, named ‘model2,’ we incorporated ‘TotalWorkingYears’ alongside ‘Age’ as predictors.
The results are striking: the model’s R² value has surged to 0.5988, signifying a substantial improvement in its ability to explain the variation in ‘MonthlyIncome.’ Equally noteworthy is the remarkably low p-value, reported by R as less than 2.2e-16, reaffirming the model’s statistical significance. This expansion of our analysis underscores the interplay between age, total working years, and monthly income within our organization, shedding light on factors that contribute significantly to earnings.
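The step from the first model to ‘model2’ could look roughly like this. The data is simulated, with income driven by total working years as an assumption, so the R² values will differ from 0.2479 and 0.5988. Adjusted R² is included as our own addition because plain R² can never decrease when a predictor is added.

```r
# Simulated employees (NOT the IBM data). All coefficients are assumptions.
set.seed(42)
n <- 1470
age <- round(runif(n, 18, 60))
total_working_years <- pmax(0, round((age - 18) * 0.6 + rnorm(n, sd = 4)))
monthly_income <- 2000 + 400 * total_working_years + rnorm(n, sd = 3000)
employees <- data.frame(
  Age = age,
  TotalWorkingYears = total_working_years,
  MonthlyIncome = monthly_income
)

# Model with Age only, then with Age and TotalWorkingYears together.
model1 <- lm(MonthlyIncome ~ Age, data = employees)
model2 <- lm(MonthlyIncome ~ Age + TotalWorkingYears, data = employees)
print(summary(model2))

# Compare explanatory power. Adjusted R-squared penalizes extra
# predictors, which makes the comparison fairer.
r2_model1 <- summary(model1)$r.squared
r2_model2 <- summary(model2)$r.squared
adj_r2_model1 <- summary(model1)$adj.r.squared
adj_r2_model2 <- summary(model2)$adj.r.squared
print(c(model1 = r2_model1, model2 = r2_model2))
print(c(model1 = adj_r2_model1, model2 = adj_r2_model2))
```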
These findings have profound implications for our HR strategy. It’s clear that age plays a pivotal role in attrition within our organization. While younger employees may be more prone to attrition, this insight allows us to take proactive measures to mitigate it. By crafting targeted HR policies and initiatives, we can foster a more engaged and loyal workforce across all age groups, ultimately enhancing our organizational stability and success.
In conclusion, our analysis of the relationship between age and attrition within our organization has yielded several key findings. We have established that age plays a significant role in attrition, with younger employees exhibiting higher turnover rates. Additionally, age correlates with other employment-related factors, such as total working years and monthly income. Our organization’s age distribution, characterized by a predominantly younger workforce, further underscores the importance of age in our HR strategy. Predictive modeling has shown age to be a predictor of monthly income, and the inclusion of total working years enhances this predictive power. Notably, intriguing attrition patterns emerged, with only the 26-35 and 56-60 age groups showing significant proportions of employees who stayed. This calls for a closer examination of the unexpected attrition trends among the 36-45 and 46-55 age groups. To address these findings, we recommend the implementation of age-diverse retention strategies, continuous monitoring, tailored development programs, open communication channels, support for employees nearing retirement, investment in data analytics, and a holistic focus on employee well-being. These recommendations aim to harness the insights gained from our analysis, fostering a stronger, more resilient workforce and securing a brighter future for our organization.
I’m always on the lookout for data analyst opportunities. Know of any? Don’t hesitate to reach out at smitchellbest@gmail.com. Make sure to explore more of my portfolio. Just click the button below.