| Type: | Package | 
| Title: | What Skills and Qualifications are Required for Data Science Related Jobs? | 
| Version: | 2.0.0 | 
| Maintainer: | Thiyanga S. Talagala <ttalagala@sjp.ac.lk> | 
| Description: | Dataset containing information about job listings for data science job roles. | 
| License: | CC BY 4.0 | 
| URL: | https://github.com/thiyangt/DSjobtracker | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| LazyDataCompression: | xz | 
| RoxygenNote: | 7.2.3 | 
| Depends: | R (≥ 3.5.0) | 
| Suggests: | knitr, rmarkdown, tibble, tidyr, ggplot2, dplyr, magrittr, testthat, wordcloud2, forcats, viridis | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2023-12-09 07:21:35 UTC; thiyangashaminitalagala | 
| Author: | Thiyanga S. Talagala | 
| Repository: | CRAN | 
| Date/Publication: | 2023-12-09 07:40:02 UTC | 
Data Scientist/Data Analyst/Statistician Job Tracker
Description
Job advertisements
Usage
DSraw
Format
A data frame with 551 rows and 152 variables
- ID
- row id 
- Consultant
- Name of the consultant 
- DateRetrieved
- Date of Data Retrieved 
- DatePublished
- Published Date of the Advertisement 
- Job_title
- Name of the job category 
- Company
- Name of the Company 
- R
- If R is required -> 1 , If not mentioned -> 0
- SAS
- If SAS is required -> 1 , If not mentioned -> 0 
- SPSS
- If SPSS is required -> 1 , If not mentioned -> 0 
- Python
- If Python is required -> 1 , If not mentioned -> 0 
- MAtlab
- If Matlab is required -> 1 , If not mentioned -> 0 
- Scala
- If Scala is required -> 1 , If not mentioned -> 0 
- C#
- If C# is required -> 1 , If not mentioned -> 0 
- MS Word
- If knowledge in MS Word is required -> 1 , If not mentioned -> 0 
- Ms Excel
- If knowledge in MS Excel is required -> 1 , If not mentioned -> 0 
- OLE/DB
- If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0 
- Ms Access
- If Ms Access is required -> 1 , If not mentioned -> 0 
- Ms PowerPoint
- If knowledge in MS PowerPoint is required -> 1 , If not mentioned -> 0
- Spreadsheets
- If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0 
- Data_visualization
- If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0
- Presentation_Skills
- If Presentation Skills are required -> 1 , If not mentioned -> 0 
- Communication
- If Communication skills are required -> 1 , If not mentioned -> 0 
- BigData
- If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0 
- Data_warehouse
- If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0 
- cloud_storage
- If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0 
- Google_Cloud
- If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0 
- AWS
- If knowledge in AWS is required -> 1 , If not mentioned -> 0 
- Machine_Learning
- If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0 
- Deep Learning
- If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0 
- Computer_vision
- If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0
- Java
- If Java is required -> 1 , If not mentioned -> 0 
- C++
- If C++ is required -> 1 , If not mentioned -> 0 
- C
- If C is required -> 1 , If not mentioned -> 0 
- Linux/Unix
- If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0 
- SQL
- If SQL is required -> 1 , If not mentioned -> 0 
- NoSQL
- If NoSQL is required -> 1 , If not mentioned -> 0 
- RDBMS
- If knowledge in RDBMS is required -> 1 , If not mentioned -> 0 
- Oracle
- If knowledge in Oracle is required -> 1 , If not mentioned -> 0 
- MySQL
- If MySQL is required -> 1 , If not mentioned -> 0
- PHP
- If PHP is required -> 1 , If not mentioned -> 0 
- Flash_Actionscript
- If knowledge in Flash ActionScript is required -> 1 , If not mentioned -> 0
- SPL
- If knowledge in SPL is required -> 1 , If not mentioned -> 0 
- web_design_and_development_tools
- If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0 
- Wordpress
- If knowledge in Wordpress is required -> 1 , If not mentioned -> 0 
- AI
- If Artificial Intelligence is required -> 1 , If not mentioned -> 0 
- Natural_Language_Processing(NLP)
- If knowledge in NLP is required -> 1 , If not mentioned -> 0 
- Microsoft Power BI
- If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0 
- Google_Analytics
- If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0 
- graphics_and_design_skills
- If Graphic and Design Skills are required -> 1 , If not mentioned -> 0 
- Data_marketing
- If Data Marketing ability is required -> 1 , If not mentioned -> 0
- SEO
- If knowledge in SEO is required -> 1 , If not mentioned -> 0 
- Content_Management
- If knowledge in Content Management is required -> 1 , If not mentioned -> 0 
- Tableau
- If knowledge in Tableau is required -> 1 , If not mentioned -> 0 
- D3
- If knowledge in D3 is required -> 1 , If not mentioned -> 0 
- Alteryx
- If knowledge in Alteryx is required -> 1 , If not mentioned -> 0 
- KNIME
- If knowledge in KNIME is required -> 1 , If not mentioned -> 0 
- Spotfire
- If knowledge in Spotfire is required -> 1 , If not mentioned -> 0 
- Spark
- If knowledge in Spark is required -> 1 , If not mentioned -> 0 
- S3
- If knowledge in S3 is required -> 1 , If not mentioned -> 0 
- Redshift
- If knowledge in Redshift is required -> 1 , If not mentioned -> 0 
- DigitalOcean
- If knowledge in DigitalOcean is required -> 1 , If not mentioned -> 0
- Javascript
- If JavaScript is required -> 1 , If not mentioned -> 0
- Kafka
- If knowledge in Kafka is required -> 1 , If not mentioned -> 0 
- Storm
- If knowledge in Storm is required -> 1 , If not mentioned -> 0 
- Bash
- If knowledge in Bash is required -> 1 , If not mentioned -> 0 
- Hadoop
- If knowledge in Hadoop is required -> 1 , If not mentioned -> 0 
- Data_Pipelines
- If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0 
- MPP_Platforms
- If MPP Platforms is required -> 1 , If not mentioned -> 0
- Qlik
- If Qlik is required -> 1 , If not mentioned -> 0
- Pig
- If Pig is required -> 1 , If not mentioned -> 0
- Hive
- If Hive is required -> 1 , If not mentioned -> 0
- Tensorflow
- If Tensorflow is required -> 1 , If not mentioned -> 0
- Map/Reduce
- If Map/Reduce is required -> 1 , If not mentioned -> 0
- Impala
- If Impala is required -> 1 , If not mentioned -> 0
- Solr
- If Solr is required -> 1 , If not mentioned -> 0
- Teradata
- If Teradata is required -> 1 , If not mentioned -> 0
- MongoDB
- If MongoDB is required -> 1 , If not mentioned -> 0
- Elasticsearch
- If Elasticsearch is required -> 1 , If not mentioned -> 0
- YOLO
- If YOLO is required -> 1 , If not mentioned -> 0
- agile execution
- If agile execution is required -> 1 , If not mentioned -> 0
- Data_management
- If knowledge in data management is required -> 1 , If not mentioned -> 0
- pyspark
- If pyspark is required -> 1 , If not mentioned -> 0
- Data_mining
- If knowledge in data mining is required -> 1 , If not mentioned -> 0
- Data_science
- If knowledge in data science is required -> 1 , If not mentioned -> 0
- Web_Analytic_tools
- If knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0
- IOT
- If IOT is required -> 1 , If not mentioned -> 0
- Numerical_Analysis
- If knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0
- Economic
- If knowledge in Economics is required -> 1 , If not mentioned -> 0
- Finance_Knowledge
- If Finance Knowledge is required -> 1 , If not mentioned -> 0
- Investment_Knowledge
- If Investment Knowledge is required -> 1 , If not mentioned -> 0
- Problem_Solving
- If the ability of Problem Solving is required -> 1 , If not mentioned -> 0
- Korean_language
- If the ability to speak Korean is required -> 1 , If not mentioned -> 0
- Bash\Linux Scripting
- If Bash/Linux scripting is required -> 1 , If not mentioned -> 0
- Knowledge_in
- Required knowledge to do a particular job , If not mentioned -> NA
- Experience
- Minimum experience required for a particular job 
- City
- City where the company is located
- Location
- Country where the company is located
- Educational_qualifications
- Required educational qualifications 
- Salary
- Amount of salary 
- Team_Handling
- If the ability of Team Handling is required -> 1 , If not mentioned -> 0
- Debtor_reconcilation
- If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
- Payroll_management
- If the ability of Payroll management is required -> 1 , If not mentioned -> 0
- Bayesian
- If Bayesian knowledge is required -> 1 , If not mentioned -> 0
- Optimization
- If Optimization knowledge is required -> 1 , If not mentioned -> 0
- Bahasa Malaysia
- If Bahasa Malaysia is required -> 1 , If not mentioned -> 0
- English proficiency
- If English proficiency is required -> 1 , If not mentioned -> 0
- URL
- Web address of a particular job advertisement 
- Search_Term
- Web search term of a particular job advertisement
- X109
- Columns with null values 
- X110
- Columns with null values 
- X111
- Columns with null values 
- X112
- Columns with null values 
- X113
- Columns with null values 
- X114
- Columns with null values 
- X115
- Columns with null values 
- X116
- Columns with null values 
- X117
- Columns with null values 
- X118
- Columns with null values 
- X119
- Columns with null values 
- X120
- Columns with null values 
- X121
- Columns with null values 
- X122
- Columns with null values 
- X123
- Columns with null values 
- X124
- Columns with null values 
- X125
- Columns with null values 
- X126
- Columns with null values 
- X127
- Columns with null values 
- X128
- Columns with null values 
- X129
- Columns with null values 
- X130
- Columns with null values 
- X131
- Columns with null values 
- X132
- Columns with null values 
- X133
- Columns with null values 
- X134
- Columns with null values 
- X135
- Columns with null values 
- X136
- Columns with null values 
- X137
- Columns with null values 
- X138
- Columns with null values 
- X139
- Columns with null values 
- X140
- Columns with null values 
- X141
- Columns with null values 
- X142
- Columns with null values 
- X143
- Columns with null values 
- X144
- Columns with null values 
- X145
- Columns with null values 
- X146
- Columns with null values 
- X147
- Columns with null values 
- X148
- Columns with null values 
- X149
- Columns with null values 
- X150
- Columns with null values 
- X151
- Columns with null values 
- X152
- Columns with null values 
Source
Collected and entered by BSc (Hons) Statistics undergraduates - 2020
Examples
data(DSraw)
head(DSraw)
summary(DSraw)
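The X109 to X152 columns are described as holding null values, and the skill columns are 0/1 indicators, so a common first step is to drop empty columns and tally the indicators. The following is a minimal base R sketch, not the package's own workflow. It uses a small made-up data frame (`toy`, hypothetical) in place of `DSraw` so that it runs without the package, and it assumes the null columns are entirely `NA`.

```r
# Hypothetical stand-in for DSraw: two 0/1 skill indicators and one all-NA column.
toy <- data.frame(
  ID = 1:4,
  R = c(1, 0, 1, 1),
  Python = c(1, 1, 0, 1),
  X109 = c(NA, NA, NA, NA)
)

# Flag columns in which every value is missing (assumption: the X columns are all NA).
all_missing <- vapply(toy, function(col) all(is.na(col)), logical(1))

# Keep only columns that contain at least one non-missing value.
toy_clean <- toy[, !all_missing, drop = FALSE]

# Count how many advertisements mention each skill.
skill_counts <- colSums(toy_clean[, c("R", "Python")], na.rm = TRUE)
print(names(toy_clean))
print(skill_counts)
```

With the real data, `toy` would be replaced by `DSraw` and the vector of skill columns extended as needed.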
Data scientists, data analyst, and statistician job advertisements from 2020 to 2023
Description
A dataset with 1172 rows and 109 variables
Usage
data(DStidy)
Details
- ID. row id 
- Consultant. Name of the consultant 
- DateRetrieved. Date of Data Retrieved 
- DatePublished. Published Date of the Advertisement 
- Job_title. Name of the job category 
- Company. Name of the Company 
- R. If R is required -> 1 , If not mentioned -> 0
- SAS. If SAS is required -> 1 , If not mentioned -> 0 
- SPSS. If SPSS is required -> 1 , If not mentioned -> 0 
- Python. If Python is required -> 1 , If not mentioned -> 0 
- MAtlab. If Matlab is required -> 1 , If not mentioned -> 0 
- Scala. If Scala is required -> 1 , If not mentioned -> 0 
- C#. If C# is required -> 1 , If not mentioned -> 0 
- MS Word. If knowledge in MS Word is required -> 1 , If not mentioned -> 0 
- Ms Excel. If knowledge in MS Excel is required -> 1 , If not mentioned -> 0 
- OLE/DB. If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0 
- Ms Access. If Ms Access is required -> 1 , If not mentioned -> 0 
- Ms PowerPoint. If knowledge in Ms Powerpoint is required -> 1 , If not mentioned -> 0 
- Spreadsheets. If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0 
- Data_visualization. If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0 
- Presentation_Skills. If Presentation Skills are required -> 1 , If not mentioned -> 0 
- Communication. If Communication skills are required -> 1 , If not mentioned -> 0 
- BigData. If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0 
- Data_warehouse. If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0 
- cloud_storage. If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0 
- Google_Cloud. If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0 
- AWS. If knowledge in AWS is required -> 1 , If not mentioned -> 0 
- Machine_Learning. If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0 
- Deep Learning. If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0
- Computer_vision. If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0 
- Java. If Java is required -> 1 , If not mentioned -> 0 
- C++. If C++ is required -> 1 , If not mentioned -> 0 
- C. If C is required -> 1 , If not mentioned -> 0 
- Linux/Unix. If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0 
- SQL. If SQL is required -> 1 , If not mentioned -> 0 
- NoSQL. If NoSQL is required -> 1 , If not mentioned -> 0 
- RDBMS. If knowledge in RDBMS is required -> 1 , If not mentioned -> 0 
- Oracle. If knowledge in Oracle is required -> 1 , If not mentioned -> 0 
- MySQL. If MySQL is required -> 1 , If not mentioned -> 0
- PHP. If PHP is required -> 1 , If not mentioned -> 0 
- Flash_Actionscript. If knowledge in Flash ActionScript is required -> 1 , If not mentioned -> 0
- SPL. If knowledge in SPL is required -> 1 , If not mentioned -> 0 
- web_design_and_development_tools. If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0 
- Wordpress. If knowledge in Wordpress is required -> 1 , If not mentioned -> 0 
- AI. If Artificial Intelligence is required -> 1 , If not mentioned -> 0 
- Natural_Language_Processing(NLP). If knowledge in NLP is required -> 1 , If not mentioned -> 0 
- Microsoft Power BI. If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0 
- Google_Analytics. If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0 
- graphics_and_design_skills. If Graphic and Design Skills are required -> 1 , If not mentioned -> 0 
- Data_marketing. If Data Marketing ability is required -> 1 , If not mentioned -> 0
- SEO. If knowledge in SEO is required -> 1 , If not mentioned -> 0 
- Content_Management. If knowledge in Content Management is required -> 1 , If not mentioned -> 0 
- Tableau. If knowledge in Tableau is required -> 1 , If not mentioned -> 0 
- D3. If knowledge in D3 is required -> 1 , If not mentioned -> 0 
- Alteryx. If knowledge in Alteryx is required -> 1 , If not mentioned -> 0 
- KNIME. If knowledge in KNIME is required -> 1 , If not mentioned -> 0 
- Spotfire. If knowledge in Spotfire is required -> 1 , If not mentioned -> 0 
- Spark. If knowledge in Spark is required -> 1 , If not mentioned -> 0 
- S3. If knowledge in S3 is required -> 1 , If not mentioned -> 0 
- Redshift. If knowledge in Redshift is required -> 1 , If not mentioned -> 0 
- DigitalOcean. If knowledge in DigitalOcean is required -> 1 , If not mentioned -> 0
- Javascript. If JavaScript is required -> 1 , If not mentioned -> 0
- Kafka. If knowledge in Kafka is required -> 1 , If not mentioned -> 0 
- Storm. If knowledge in Storm is required -> 1 , If not mentioned -> 0 
- Bash. If knowledge in Bash is required -> 1 , If not mentioned -> 0 
- Hadoop. If knowledge in Hadoop is required -> 1 , If not mentioned -> 0 
- Data_Pipelines. If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0 
- MPP_Platforms. If MPP Platforms is required -> 1 , If not mentioned -> 0
- Qlik. If Qlik is required -> 1 , If not mentioned -> 0
- Pig. If Pig is required -> 1 , If not mentioned -> 0
- Hive. If Hive is required -> 1 , If not mentioned -> 0
- Tensorflow. If Tensorflow is required -> 1 , If not mentioned -> 0
- Map/Reduce. If Map/Reduce is required -> 1 , If not mentioned -> 0
- Impala. If Impala is required -> 1 , If not mentioned -> 0
- Solr. If Solr is required -> 1 , If not mentioned -> 0
- Teradata. If Teradata is required -> 1 , If not mentioned -> 0
- MongoDB. If MongoDB is required -> 1 , If not mentioned -> 0
- Elasticsearch. If Elasticsearch is required -> 1 , If not mentioned -> 0
- YOLO. If YOLO is required -> 1 , If not mentioned -> 0
- agile execution. If agile execution is required -> 1 , If not mentioned -> 0
- Data_management. If knowledge in data management is required -> 1 , If not mentioned -> 0
- pyspark. If pyspark is required -> 1 , If not mentioned -> 0
- Data_mining. If knowledge in data mining is required -> 1 , If not mentioned -> 0
- Data_science. If knowledge in data science is required -> 1 , If not mentioned -> 0
- Web_Analytic_tools. If knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0
- IOT. If IOT is required -> 1 , If not mentioned -> 0
- Numerical_Analysis. If knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0
- Economic. If knowledge in Economics is required -> 1 , If not mentioned -> 0
- Finance_Knowledge. If Finance Knowledge is required -> 1 , If not mentioned -> 0
- Investment_Knowledge. If Investment Knowledge is required -> 1 , If not mentioned -> 0
- Problem_Solving. If the ability of Problem Solving is required -> 1 , If not mentioned -> 0
- Team_Handling. If the ability of Team Handling is required -> 1 , If not mentioned -> 0
- Debtor_reconcilation. If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
- Payroll_management. If Payroll management is required -> 1 , If not mentioned -> 0
- Bayesian. If Bayesian is required -> 1 , If not mentioned -> 0
- Optimization. If Optimization knowledge is required -> 1 , If not mentioned -> 0
- Knowledge_in. Required knowledge to do a particular job , If not mentioned -> NA
- City. City where the company is located
- Educational_qualifications. Required educational qualifications 
- Salary. Amount of salary 
- URL. Web address of a particular job advertisement 
- Search_Term. Web search term of a particular job advertisement
- Job_Category. Category of the job (e.g. "Data Science", "Data Analyst")
- Team_Handling. If the ability of Team Handling is required -> 1 , If not mentioned -> 0
- Debtor_reconcilation. If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
- Payroll_management. If the ability of Payroll management is required -> 1 , If not mentioned -> 0
- Bayesian. If Bayesian knowledge is required -> 1 , If not mentioned -> 0
- Bahasa_Malaysia. If Bahasa Malaysia is required -> 1 , If not mentioned -> 0
- English_proficiency. If English proficiency is required -> 1 , If not mentioned -> 0
- Experience_Category. Number of years of experience, binned into categories
- Location. Location
- Payment Frequency. Payment frequency
- BSc_needed. If BSc is required -> 1 , If not mentioned -> 0
- MSc_needed. If MSc is required -> 1 , If not mentioned -> 0
- PhD_needed. If PhD is required -> 1 , If not mentioned -> 0
- English Needed. If English is required -> 1 , If not mentioned -> 0
- year. Survey year 
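Because each skill column is a 0/1 indicator and `year` records the survey year, the share of advertisements mentioning a skill in each year is the per-year mean of that column. The sketch below illustrates this in base R with a small made-up data frame (`toy`, hypothetical) standing in for `DStidy`. The values are invented for illustration and are not taken from the dataset.

```r
# Hypothetical stand-in for DStidy: a survey year and two 0/1 skill indicators.
toy <- data.frame(
  year = c(2020, 2020, 2021, 2021, 2021),
  R = c(1, 0, 1, 1, 0),
  Python = c(1, 1, 1, 0, 1)
)

# The mean of a 0/1 indicator is the proportion of advertisements mentioning the skill.
share_by_year <- aggregate(cbind(R, Python) ~ year, data = toy, FUN = mean)
print(share_by_year)
```

Note that the formula interface of `aggregate()` drops rows with missing values by default, which may matter for columns that contain `NA`.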
Source
Data collection was done by BSc (Hons) Statistics, University of Sri Jayewardenepura, under the statistical consultancy service from 2020 to 2023.
Data scientists, data analyst, and statistician-related job advertisements in 2020
Description
A dataset with 430 rows and 115 columns
Usage
data(DStidy_2020)
Details
- ID. Row id 
- Consultant. Name of the consultant 
- DateRetrieved. Date of data retrieved 
- DatePublished. Published date of the advertisement 
- Job_title. Name of the job category 
- Company. Name of the company 
- R. If R is required -> 1 , If not mentioned -> 0 
- SAS. If SAS is required -> 1 , If not mentioned -> 0 
- SPSS. If SPSS is required -> 1 , If not mentioned -> 0 
- Python. If Python is required -> 1 , If not mentioned -> 0 
- MAtlab. If Matlab is required -> 1 , If not mentioned -> 0
- Scala. If Scala is required -> 1 , If not mentioned -> 0
- C_Sharp. If C# is required -> 1 , If not mentioned -> 0
- Ms_Excel. If knowledge in MS Excel is required -> 1 , If not mentioned -> 0
- OLE_DB. If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0
- Ms_Access. If Ms Access is required -> 1 , If not mentioned -> 0
- Ms_PowerPoint. If knowledge in MS PowerPoint is required -> 1 , If not mentioned -> 0
- Spreadsheets. If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0
- Data_visualization. If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0 
- Presentation_Skills. If Presentation Skills are required -> 1 , If not mentioned -> 0 
- Communication. If Communication skills are required -> 1 , If not mentioned -> 0 
- BigData. If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0 
- Data_warehouse. If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0 
- cloud_storage. If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0 
- Google_Cloud. If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0 
- AWS. If knowledge in AWS is required -> 1 , If not mentioned -> 0 
- Machine_Learning. If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0 
- Deep_Learning. If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0 
- Computer_vision. If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0 
- Java. If Java is required -> 1 , If not mentioned -> 0 
- Cpp. If C++ is required -> 1 , If not mentioned -> 0
- C. If C is required -> 1 , If not mentioned -> 0 
- Linux_Unix. If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0 
- SQL. If SQL is required -> 1 , If not mentioned -> 0 
- NoSQL. If NoSQL is required -> 1 , If not mentioned -> 0 
- RDBMS. If knowledge in RDBMS is required -> 1 , If not mentioned -> 0 
- Oracle. If knowledge in Oracle is required -> 1 , If not mentioned -> 0 
- MySQL. If MySQL is required -> 1 , If not mentioned -> 0
- PHP. If PHP is required -> 1 , If not mentioned -> 0 
- Flash_Actionscript. If knowledge in Flash ActionScript is required -> 1 , If not mentioned -> 0
- SPL. If knowledge in SPL is required -> 1 , If not mentioned -> 0 
- web_design_and_development_tools. If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0 
- Wordpress. If Wordpress is required -> 1 , If not mentioned -> 0 
- AI. If AI is required -> 1 , If not mentioned -> 0
- Natural_Language_Processing(NLP). If knowledge in NLP is required -> 1 , If not mentioned -> 0 
- Microsoft_Power_BI. If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0 
- Google_Analytics. If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0 
- graphics_and_design_skills. If Graphic and Design Skills are required -> 1 , If not mentioned -> 0 
- Data_marketing. If Data Marketing ability is required -> 1 , If not mentioned -> 0
- SEO. If knowledge in SEO is required -> 1 , If not mentioned -> 0 
- Content_Management. If knowledge in Content Management is required -> 1 , If not mentioned -> 0 
- Tableau. If knowledge in Tableau is required -> 1 , If not mentioned -> 0 
- D3. If knowledge in D3 is required -> 1 , If not mentioned -> 0 
- Alteryx. If knowledge in Alteryx is required -> 1 , If not mentioned -> 0 
- KNIME. If knowledge in KNIME is required -> 1 , If not mentioned -> 0 
- Spotfire. If knowledge in Spotfire is required -> 1 , If not mentioned -> 0 
- Spark. If knowledge in Spark is required -> 1 , If not mentioned -> 0 
- S3. If knowledge in S3 is required -> 1 , If not mentioned -> 0 
- Redshift. If knowledge in Redshift is required -> 1 , If not mentioned -> 0 
- DigitalOcean. If knowledge in DigitalOcean is required -> 1 , If not mentioned -> 0
- Javascript. If JavaScript is required -> 1 , If not mentioned -> 0
- Kafka. If knowledge in Kafka is required -> 1 , If not mentioned -> 0 
- Storm. If knowledge in Storm is required -> 1 , If not mentioned -> 0 
- Bash. If knowledge in Bash is required -> 1 , If not mentioned -> 0 
- Hadoop. If knowledge in Hadoop is required -> 1 , If not mentioned -> 0 
- Data_Pipelines. If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0 
- MPP_Platforms. If MPP Platforms is required -> 1 , If not mentioned -> 0 
- Qlik. If Qlik is required -> 1 , If not mentioned -> 0 
- Pig. If Pig is required -> 1 , If not mentioned -> 0 
- Hive. If Hive is required -> 1 , If not mentioned -> 0 
- Tensorflow. If Tensorflow is required -> 1 , If not mentioned -> 0 
- Map_Reduce. If Map/Reduce is required -> 1 , If not mentioned -> 0 
- Impala. If Impala is required -> 1 , If not mentioned -> 0
- Solr. If Solr is required -> 1 , If not mentioned -> 0
- Teradata. If Teradata is required -> 1 , If not mentioned -> 0
- MongoDB. If MongoDB is required -> 1 , If not mentioned -> 0
- Elasticsearch. If Elasticsearch is required -> 1 , If not mentioned -> 0
- YOLO. If YOLO is required -> 1 , If not mentioned -> 0
- agile_execution. If agile execution is required -> 1 , If not mentioned -> 0 
- Data_management. If the knowledge in Data Management is required -> 1 , If not mentioned -> 0 
- pyspark. If pyspark is required -> 1 , If not mentioned -> 0 
- Data_mining. If the knowledge in Data Mining is required -> 1 , If not mentioned -> 0 
- Data_science. If the knowledge in Data Science is required -> 1 , If not mentioned -> 0 
- Web_Analytic_tools. If the knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0 
- IOT. If IOT is required -> 1 , If not mentioned -> 0 
- Numerical_Analysis. If the knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0 
- Economic. If the knowledge in Economics is required -> 1 , If not mentioned -> 0
- Finance_Knowledge. If Finance Knowledge is required -> 1 , If not mentioned -> 0
- Investment_Knowledge. If Investment Knowledge is required -> 1 , If not mentioned -> 0 
- Problem_Solving. If the ability of Problem Solving is required -> 1 , If not mentioned -> 0 
- Korean_language. If the ability to speak Korean is required -> 1 , If not mentioned -> 0
- Bash_Linux_Scripting. If Bash Linux Scripting is required -> 1 , If not mentioned -> 0 
- Team_Handling. If the ability of Team Handling is required -> 1 , If not mentioned -> 0 
- Debtor_reconcilation. If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0 
- Payroll_management. If the ability of Payroll management is required -> 1 , If not mentioned -> 0 
- Bayesian. If Bayesian knowledge is required -> 1 , If not mentioned -> 0 
- Optimization. If Optimization knowledge is required -> 1 , If not mentioned -> 0
- Bahasa_Malaysia. If Bahasa Malaysia is required -> 1 , If not mentioned -> 0
- Knowledge_in. Required knowledge to do a particular job , If not mentioned -> NA
- City. City where the company is located , If not mentioned -> NA
- Location. Country where the company is located
- Educational_qualifications. Required educational qualifications 
- Salary. Salary 
- English_proficiency. English proficiency 
- URL. URL of the job advertisement 
- Search_Term. Search Term 
- Job_Category. Name of the job category 
- Minimum_Years_of_experience. Minimum years of experience needed for the job , If not mentioned -> NA 
- Experience. Experience 
- Experience_Category. Experience category 
- Job_Country. Job country 
- Edu_Category. Education category 
- Minimum_Salary. Minimum salary 
- Salary_Basis. Salary basis
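The list above includes both `Minimum_Years_of_experience` and a derived `Experience_Category`. A binned category of this kind can be built from a numeric column with `cut()`, as in the sketch below. The bin edges and labels are illustrative assumptions, not the categories actually used in the package, and `min_years` is a made-up vector.

```r
# Hypothetical minimum years of experience; NA means the advertisement did not state it.
min_years <- c(0, 1, 3, 7, NA)

# Bin edges and labels are illustrative assumptions, not the package's own categories.
# With the default right = TRUE the intervals are (-Inf, 2], (2, 5], (5, Inf).
experience_category <- cut(
  min_years,
  breaks = c(-Inf, 2, 5, Inf),
  labels = c("0-2 years", "3-5 years", "more than 5 years")
)

# Tabulate the categories, keeping missing values visible.
print(table(experience_category, useNA = "ifany"))
```

Missing inputs stay `NA` after binning, which matches the "If not mentioned -> NA" convention used for the experience column.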
Source
Data wrangling was done by Janith C. Wanniarachchie, BSc (Hons) Statistics, University of Sri Jayewardenepura, and the description file was prepared by Randi Shashikala.
Data Scientist/Data Analyst/Statistician Job Advertisements in the year 2021
Description
Job advertisements collected in the year 2021
Usage
DStidy_2021
Format
A data frame with 382 rows and 115 columns
- ID
- Row id 
- Consultant
- Name of the consultant 
- URL
- Web address of a particular job advertisement 
- Search_Term
- Web search term of a particular job advertisement 
- DateRetrieved
- Date of data retrieved 
- DatePublished
- Published date of the advertisement 
- Job_Field
- Name of the related job field 
- Job_title
- Name of the job category 
- Company
- Name of the company 
- Knowledge_in
- Required knowledge to do a particular job , If not mentioned -> NA 
- Minimum Experience in Years
- Minimum years of experience needed for the job , If not mentioned -> NA 
- City
- City where the company is located , If not mentioned -> NA
- Location
- Country where the company is located
- Educational_qualifications
- Required educational qualifications 
- Payment Frequency
- Payment basis of the salary (i.e. "hourly", "daily", "monthly", "yearly", "NA")
- Currency
- Currency type of the salary 
- Salary
- Amount of salary 
- English Needed
- If English proficiency is required -> 1 , If not mentioned -> 0 
- English proficiency description
- Required level of English proficiency , If not mentioned -> NA 
- Additional_languages
- If languages other than English are required -> 1 , If not mentioned -> NA
- AI
- If Artificial Intelligence is required -> 1 , If not mentioned -> 0 
- Natural_Language_Processing(NLP)
- If knowledge in NLP is required -> 1 , If not mentioned -> 0 
- Data_Pipelines
- If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0 
- Machine_Learning
- If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0 
- Deep Learning
- If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0 
- Computer_vision
- If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0 
- Data_visualization
- If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0 
- Data_warehouse
- If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0 
- BigData
- If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0 
- Data_management
- If the knowledge in Data Management is required -> 1 , If not mentioned -> 0 
- Data_mining
- If the knowledge in Data Mining is required -> 1 , If not mentioned -> 0 
- Data_science
- If the knowledge in Data Science is required -> 1 , If not mentioned -> 0 
- Bayesian
- If Bayesian knowledge is required -> 1 , If not mentioned -> 0 
- Optimization
- If Optimization knowledge is required -> 1 , If not mentioned -> 0
- Numerical_Analysis
- If the knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0 
- IOT
- If IOT is required -> 1 , If not mentioned -> 0 
- Data_translation
- If the knowledge in Data Translation is required -> 1 , If not mentioned -> 0 
- R
- If R is required -> 1 , If not mentioned -> 0
- SAS
- If SAS is required -> 1 , If not mentioned -> 0 
- SPSS
- If SPSS is required -> 1 , If not mentioned -> 0 
- Python
- If Python is required -> 1 , If not mentioned -> 0 
- MAtlab
- If Matlab is required -> 1 , If not mentioned -> 0 
- Scala
- If Scala is required -> 1 , If not mentioned -> 0 
- C#
- If C# is required -> 1 , If not mentioned -> 0 
- Java
- If Java is required -> 1 , If not mentioned -> 0 
- C++
- If C++ is required -> 1 , If not mentioned -> 0 
- C
- If C is required -> 1 , If not mentioned -> 0 
- Bash
- If Bash is required -> 1 , If not mentioned -> 0 
- Tensorflow
- If Tensorflow is required -> 1 , If not mentioned -> 0 
- pyspark
- If pyspark is required -> 1 , If not mentioned -> 0 
- YOLO
- If YOLO is required -> 1 , If not mentioned -> 0
- MS Word
- If knowledge in MS Word is required -> 1 , If not mentioned -> 0 
- Ms Excel
- If knowledge in MS Excel is required -> 1 , If not mentioned -> 0 
- Ms Access
- If Ms Access is required -> 1 , If not mentioned -> 0 
- Ms PowerPoint
- If knowledge in MS PowerPoint is required -> 1 , If not mentioned -> 0
- Spreadsheets
- If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0 
- Google_Analytics
- If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0 
- Microsoft Power BI
- If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0 
- Tableau
- If knowledge in Tableau is required -> 1 , If not mentioned -> 0 
- D3
- If knowledge in D3 is required -> 1 , If not mentioned -> 0 
- Qlik
- If Qlik is required -> 1 , If not mentioned -> 0 
- KNIME
- If knowledge in KNIME is required -> 1 , If not mentioned -> 0 
- Spotfire
- If knowledge in Spotfire is required -> 1 , If not mentioned -> 0 
- Linux/Unix
- If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0 
- OLE/DB
- If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0 
- SQL
- If SQL is required -> 1 , If not mentioned -> 0 
- NoSQL
- If NoSQL is required -> 1 , If not mentioned -> 0 
- RDBMS
- If knowledge in RDBMS is required -> 1 , If not mentioned -> 0 
- Oracle
- If knowledge in Oracle is required -> 1 , If not mentioned -> 0 
- MySQL
- If MySQL is required -> 1 , If not mentioned -> 0 
- MongoDB
- If MongoDB is required -> 1 , If not mentioned -> 0 
- MPP_Platforms
- If MPP Platforms is required -> 1 , If not mentioned -> 0 
- SPL
- If knowledge in SPL is required -> 1 , If not mentioned -> 0 
- Alteryx
- If knowledge in Alteryx is required -> 1 , If not mentioned -> 0 
- Spark
- If knowledge in Spark is required -> 1 , If not mentioned -> 0 
- Kafka
- If knowledge in Kafka is required -> 1 , If not mentioned -> 0 
- Hadoop
- If knowledge in Hadoop is required -> 1 , If not mentioned -> 0 
- Pig
- If Pig is required -> 1 , If not mentioned -> 0 
- Hive
- If Hive is required -> 1 , If not mentioned -> 0 
- Map/Reduce
- If Map/Reduce is required -> 1 , If not mentioned -> 0 
- Impala
- If Impala is required -> 1 , If not mentioned -> 0 
- Storm
- If knowledge in Storm is required -> 1 , If not mentioned -> 0 
- Google_Cloud
- If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0 
- AWS
- If knowledge in AWS is required -> 1 , If not mentioned -> 0 
- cloud_storage
- If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0 
- S3
- If knowledge in S3 is required -> 1 , If not mentioned -> 0 
- Redshift
- If knowledge in Redshift is required -> 1 , If not mentioned -> 0 
- DigitalOcean
- If knowledge in DigitalOcean is required -> 1 , If not mentioned -> 0 
- Teradata
- If Teradata is required -> 1 , If not mentioned -> 0 
- Solr
- If Solr is required -> 1 , If not mentioned -> 0 
- Elasticsearch
- If Elasticsearch is required -> 1 , If not mentioned -> 0 
- Presentation_Skills
- If Presentation Skills are required -> 1 , If not mentioned -> 0 
- Communication
- If Communication skills are required -> 1 , If not mentioned -> 0 
- Problem_Solving
- If Problem Solving ability is required -> 1 , If not mentioned -> 0 
- Team_Handling
- If Team Handling ability is required -> 1 , If not mentioned -> 0 
- agile execution
- If agile execution is required -> 1 , If not mentioned -> 0 
- Data_marketing
- If Data Marketing ability is required -> 1 , If not mentioned -> 0 
- SEO
- If knowledge in SEO is required -> 1 , If not mentioned -> 0 
- graphics_and_design_skills
- If Graphics and Design Skills are required -> 1 , If not mentioned -> 0 
- Content_Management
- If knowledge in Content Management is required -> 1 , If not mentioned -> 0 
- Economic
- If knowledge in Economics is required -> 1 , If not mentioned -> 0 
- Finance_Knowledge
- If Finance Knowledge is required -> 1 , If not mentioned -> 0 
- Investment_Knowledge
- If Investment Knowledge is required -> 1 , If not mentioned -> 0 
- Debtor_reconcilation
- If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0 
- Payroll_management
- If the ability of Payroll management is required -> 1 , If not mentioned -> 0 
- web_design_and_development_tools
- If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0 
- PHP
- If PHP is required -> 1 , If not mentioned -> 0 
- Javascript
- If JavaScript is required -> 1 , If not mentioned -> 0 
- Web_Analytic_tools
- If the knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0 
- BSc_needed
- If a BSc Degree is required -> Yes , If not mentioned -> No/NA 
- MSc_needed
- If an MSc Degree is required -> Yes , If not mentioned -> No/NA 
- PhD_needed
- If a PhD Degree is required -> Yes , If not mentioned -> No/NA 
- Country
- Country 
- country_code
- country code 
- Job_Category
- Job category 
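The coding scheme above (1 = required, 0 = not mentioned for skill columns; Yes, No or NA for degree columns) lends itself to simple column-wise summaries. The following is a minimal sketch on a made-up toy data frame, not on the package data: the object `toy_jobs` and its values are hypothetical, and only the column names and coding follow the format described above.

```r
# Toy data frame mimicking the indicator layout described above.
# The values are invented for illustration and are NOT taken from the package.
toy_jobs <- data.frame(
  R          = c(1, 0, 1, 1),
  Python     = c(1, 1, 0, 1),
  SQL        = c(0, 1, 0, 1),
  BSc_needed = c("Yes", NA, "No", "Yes"),
  check.names = FALSE
)

# Columns coded 1 = required, 0 = not mentioned
skill_cols <- c("R", "Python", "SQL")

# Number of advertisements that mention each skill
skill_counts <- colSums(toy_jobs[skill_cols], na.rm = TRUE)

# Share of advertisements that mention each skill
skill_share <- skill_counts / nrow(toy_jobs)

# Degree columns are coded Yes / No / NA, so keep NA visible when tabulating
bsc_table <- table(toy_jobs$BSc_needed, useNA = "ifany")

print(sort(skill_counts, decreasing = TRUE))
print(skill_share)
print(bsc_table)
```

Because several real column names contain spaces or symbols (for example `MS Word` or `C#`), they would need backticks or string indexing such as `data[["MS Word"]]` when accessed in R.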
Source
Data wrangling was done by Janith C. Wanniarachchie, BSc (Hons) Statistics, University of Sri Jayewardenepura, and the description file was prepared by Randi Shashikala.
Get data from DSjobtracker for specific years or all the years combined into one dataset
Description
The DSjobtracker dataset is updated each year through the Statistical Consultancy Service of the University of Sri Jayewardenepura. To accommodate structural changes in the data from year to year, this function returns either the data for a specific year or the data for all years combined into one dataset.
Usage
get_data(year)
Arguments
| year | either "all" or a year from 2020 onward (2020, 2021, ...), given as a numeric value |
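A short usage sketch for the two documented forms of the `year` argument follows. It assumes the DSjobtracker package is installed and that data for 2021 is available. The object names `jobs_all` and `jobs_2021` are illustrative only, and the shape of the returned object is not specified here, so the sketch only inspects it.

```r
# Guarded so the script still runs where the package is not installed
if (requireNamespace("DSjobtracker", quietly = TRUE)) {
  # All years combined into one dataset
  jobs_all <- DSjobtracker::get_data("all")

  # A single year, passed as a numeric value (2021 is an assumed example)
  jobs_2021 <- DSjobtracker::get_data(2021)

  # The structure may differ between years, so compare the column sets
  print(dim(jobs_all))
  print(setdiff(names(jobs_all), names(jobs_2021)))
}
```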