This paper forecasts green-skill demand from 204,373 Mexican auto-industry skill records, but I do not buy the precision story yet.
The pipeline is sensible. The authors use postings from Indeed Mexico, OCC Mundial, and LinkedIn, spanning July 2024 to July 2025. They extract 204,373 skill records, apply multilingual embeddings, validate against ESCO, and identify 274 green skills across 8,576 mentions. Green skills are 4.22% of all extracted skills. For labor-market analytics, that setup is useful. Job ads mix Spanish, English, supplier jargon, and translated competency phrases. A static keyword list misses variants like recycling operations, renewable-energy systems, and waste-management compliance.
The weak point is sample geometry. The 8,576 green-skill mentions are split across 274 skills and a 12-to-13-month window. That averages about 31 mentions per skill, or two to three per skill per month if the series are monthly, and skill frequencies are usually long-tailed, so most skills will sit below that average. Many of the resulting time series will be sparse. The abstract says FEDformer, Reformer, and Informer lead among 15 forecasting models, with MAE around 2.5e-5 and relative RMSE below 15. That sounds clean, but the abstract does not disclose the target normalization, the time granularity, the minimum support per skill, or the exact rolling-origin split. On low-base-rate series, MAE can look excellent when a model mostly predicts values near zero. Green skills are only 4.22% of the extracted skill universe, so a tiny absolute error does not automatically translate into a reliable workforce signal.
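To put rough numbers on the sparsity argument, here is a back-of-envelope sketch. The three totals are the figures quoted above. The monthly bucketing, the share-of-all-records normalization, and the example series are my assumptions, not anything the abstract states.

```python
# Totals quoted from the abstract.
GREEN_MENTIONS = 8_576
GREEN_SKILLS = 274
TOTAL_RECORDS = 204_373

# ASSUMPTION: monthly aggregation, July 2024 through July 2025 inclusive.
MONTHS = 13

mentions_per_skill = GREEN_MENTIONS / GREEN_SKILLS
mentions_per_skill_month = mentions_per_skill / MONTHS

# ASSUMPTION: the forecast target is a skill's monthly count divided by
# all skill records in that month, with records spread evenly over months.
records_per_month = TOTAL_RECORDS / MONTHS
typical_share = mentions_per_skill_month / records_per_month


def mae(actual, predicted):
    """Mean absolute error over two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# HYPOTHETICAL sparse series: monthly mention counts for one green skill.
counts = [2, 3, 1, 4, 2, 0, 3, 5, 2, 1, 3, 4, 2]
shares = [c / records_per_month for c in counts]

# Trivial forecaster that always predicts zero demand.
zero_forecast = [0.0] * len(shares)
zero_mae = mae(shares, zero_forecast)

print(f"mentions per skill:           {mentions_per_skill:.1f}")
print(f"mentions per skill per month: {mentions_per_skill_month:.2f}")
print(f"typical normalized target:    {typical_share:.2e}")
print(f"MAE of always-zero forecast:  {zero_mae:.2e}")
```

Under those assumptions the typical target value is on the order of 1e-4, so an always-zero forecast already has an MAE of that size. A reported MAE of 2.5e-5 would then be roughly a sixth of the trivial baseline, not orders of magnitude below it, which is why the normalization has to be disclosed before the number means anything.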
I have stronger concerns about labeling than forecasting. ESCO is a European skills taxonomy. Applying it to Mexico’s automotive sector introduces domain transfer risk. Mexican auto postings are shaped by North American OEMs, tier-1 suppliers, quality systems, maintenance roles, and manufacturing-engineering language. Those ads may not express “green” work in ESCO-native terms. Embeddings help, but they also blur nearby concepts. “Lean manufacturing,” “energy efficiency,” “waste reduction,” and “process optimization” can sit close in embedding space while carrying different labor-policy meanings. The abstract does not give manual-label agreement, precision and recall, negative-sample audits, or a confusion analysis. Without those, the 274-skill inventory is hard to trust.
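The audit I am asking for is not complicated. Below is a minimal sketch of the two numbers I would want: precision and recall of the pipeline's green/not-green labels against a human annotator, and Cohen's kappa between two annotators. The label vectors are invented for illustration, and the function names are mine, not the paper's.

```python
def precision_recall(predicted, truth):
    """Precision and recall for binary labels (1 = green skill)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning binary labels."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    rate_a = sum(rater_a) / n
    rate_b = sum(rater_b) / n
    # Agreement expected by chance if the raters labeled independently.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL labels for ten extracted skill phrases.
pipeline = [1, 1, 1, 1, 0, 0, 1, 0, 0, 1]
annotator_a = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]
annotator_b = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

precision, recall = precision_recall(pipeline, annotator_a)
kappa = cohen_kappa(annotator_a, annotator_b)
print(f"precision={precision:.2f} recall={recall:.2f} kappa={kappa:.2f}")
```

A few hundred sampled phrases, including phrases the pipeline rejected, would be enough to report these with usable confidence intervals.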
The relevant external comparison is the job-posting analytics line of work from Burning Glass (now Lightcast) and the OECD. Those systems spend a lot of effort on deduplication, employer normalization, reposting behavior, seniority detection, and occupational mapping. This abstract names three platforms, but it does not say how duplicate ads were handled across LinkedIn, Indeed Mexico, and OCC Mundial. A single role reposted on two sites can inflate skill mentions. A company refreshing the same ad weekly can look like rising demand. For a macro labor signal, that is not a minor cleaning issue.
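For concreteness, here is a rough sketch of the simplest deduplication rule I would expect to see documented: collapse postings that share a normalized employer, title, and city, and keep the earliest one. The record fields and the example postings are hypothetical. I am not claiming this is what the authors did or what commercial vendors do.

```python
import re
import unicodedata
from datetime import date


def normalize(text):
    """Lowercase, strip accents, and collapse punctuation and spacing."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def dedupe(postings):
    """Keep the earliest posting for each (employer, title, city) key."""
    first_seen = {}
    for posting in postings:
        key = (
            normalize(posting["employer"]),
            normalize(posting["title"]),
            normalize(posting["city"]),
        )
        kept = first_seen.get(key)
        if kept is None or posting["posted"] < kept["posted"]:
            first_seen[key] = posting
    return list(first_seen.values())


# HYPOTHETICAL postings: one role cross-posted and refreshed, plus one other role.
postings = [
    {"source": "Indeed Mexico", "employer": "Acme Autopartes",
     "title": "Ingeniero de Energía", "city": "Querétaro", "posted": date(2025, 1, 6)},
    {"source": "OCC Mundial", "employer": "ACME Autopartes",
     "title": "Ingeniero de energia", "city": "Queretaro", "posted": date(2025, 1, 8)},
    {"source": "LinkedIn", "employer": "Acme Autopartes",
     "title": "Ingeniero de Energía", "city": "Querétaro", "posted": date(2025, 1, 13)},
    {"source": "LinkedIn", "employer": "Acme Autopartes",
     "title": "Técnico de Reciclaje", "city": "Puebla", "posted": date(2025, 1, 9)},
]

unique = dedupe(postings)
print(f"{len(postings)} raw postings -> {len(unique)} unique roles")
```

An exact-key rule like this is crude in both directions. It merges genuinely separate openings for the same role at the same plant, and it misses reposts with reworded titles. That is exactly why the paper should state its rule and show how much the skill counts move with and without it.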
The model choice also smells over-engineered. FEDformer and Informer were designed for long-sequence forecasting, and Reformer for long sequences in general. This dataset covers July 2024 to July 2025. If the authors aggregated monthly, each skill has at most 13 points, which is tiny for these architectures. If they aggregated weekly or daily, the abstract should say so. Benchmarking 15 time-series models under a short window can become a ritual rather than evidence. I would want to see strong naive baselines, ARIMA, Prophet, and a lagged LightGBM setup. If the Transformer family beats those under rolling-origin evaluation, then I care. If the gain is only against other deep models, the result is much less useful.
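Here is a minimal sketch of the baseline comparison I have in mind: an expanding-window rolling-origin evaluation with a last-value forecaster and a short moving average. The series is hypothetical, and the minimum training length of six months is my choice, since the abstract does not describe the split.

```python
def naive_forecast(history):
    """Predict the next value as the last observed value."""
    return history[-1]


def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def rolling_origin_mae(series, forecaster, min_train=6):
    """One-step-ahead MAE over an expanding training window.

    For each origin t >= min_train, fit on series[:t] and score on series[t].
    """
    errors = []
    for origin in range(min_train, len(series)):
        prediction = forecaster(series[:origin])
        errors.append(abs(series[origin] - prediction))
    return sum(errors) / len(errors), len(errors)


# HYPOTHETICAL monthly mention counts for one skill, 13 months.
series = [2, 3, 1, 4, 2, 0, 3, 5, 2, 1, 3, 4, 2]

naive_mae, folds = rolling_origin_mae(series, naive_forecast)
average_mae, _ = rolling_origin_mae(series, moving_average_forecast)
print(f"folds={folds} naive MAE={naive_mae:.2f} moving-average MAE={average_mae:.2f}")
```

Note how few test points survive: with 13 monthly observations and six held back for training, each skill contributes only seven one-step forecasts. Any deep model's reported advantage should be shown against baselines like these on the same folds.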
The stronger part is the growth-classification framework. The authors classify skills by absolute and relative growth, then separate stable, emerging, and high-impact competencies. They report that current demand concentrates in operational sustainability practices, while faster growth appears in renewable energy, recycling, and hydrogen technologies. That is a better product than a next-period forecast. Automotive green transition is not only EV engineering. It hits plant energy management, waste handling, supplier compliance, recycling loops, and hydrogen-adjacent maintenance. Still, for training policy, the paper would need region, occupation, seniority, salary, and firm-type cuts. The abstract does not disclose those.
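The abstract does not give the classification rule, so the following is only my guess at how such a two-axis scheme could work. The thresholds, the mapping from axes to the three labels, and the example counts are all assumptions made up for illustration.

```python
def classify_growth(earlier, later, abs_threshold=10, rel_threshold=0.5):
    """Label a skill from mention counts in two consecutive periods.

    ASSUMED rule (not from the paper):
      high-impact: large absolute AND large relative growth
      emerging:    large relative growth from a small base
      stable:      everything else
    """
    absolute = later - earlier
    if earlier > 0:
        relative = absolute / earlier
    else:
        # A skill that appears from nothing has unbounded relative growth.
        relative = float("inf") if later > 0 else 0.0

    if absolute >= abs_threshold and relative >= rel_threshold:
        return "high-impact"
    if relative >= rel_threshold:
        return "emerging"
    return "stable"


# HYPOTHETICAL counts: (skill, mentions in first half, mentions in second half).
examples = [
    ("waste management", 120, 126),
    ("hydrogen systems", 2, 6),
    ("renewable energy", 40, 75),
]

labels = {skill: classify_growth(first, second) for skill, first, second in examples}
for skill, label in labels.items():
    print(f"{skill}: {label}")
```

The hydrogen example shows the catch. Tripling from two mentions to six clears any relative threshold, yet it is four postings. A scheme like this needs a minimum-support rule, which is one more thing the abstract leaves out.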
My read: this is a decent pipeline demo, not yet a decision system. Multilingual embeddings plus ESCO validation can extract green-skill signals from messy postings. Rolling-origin forecasting can rank model families. But a 2.5e-5 MAE does not carry much operational weight until the paper shows deduplication rules, label-quality audits, support thresholds, and baseline comparisons. If I were using this inside a workforce-planning team, I would ask for three tables before trusting the conclusion: duplicate-removal impact, per-skill sample counts by time bucket, and human validation of the green-skill labels. Without those, the model score is an academic artifact, not a labor-market instrument.