In this project, we used the data available on the AI4Belgium website to build a dataset of the AI companies in the different regions of Belgium.
The goal of this short project is to help job seekers (in particular, those who follow our AI training at BeCode.org) find AI companies close to their location where they could apply for jobs or internships.
.
├── data/
│ ├── old_data/
│ ├── raw_scraped_dataset.csv
│ ├── clean_scraped_dataset.csv
│ └── geocoded_dataset.csv
├── scripts/
├── testing/
├── .gitignore
├── app.py
└── README.md
- Clone the repository to your local machine.
- Install the requirements:
    pip install -r requirements.txt
- To run the app locally, run in a terminal:
    streamlit run app.py
- First, we scrape company name, website, category, region, logo, and creation year from AI4Belgium using Selenium and save the data as a CSV file.
- Then, we find and scrape company addresses from the CBE Public Search website.
- We use Geopy to geocode the locations using the OpenStreetMap API and visualize the results in an interactive map using Plotly.
- Finally, we deploy on the web using Streamlit. You can navigate to the app using this link: https://ai-landscape-be.streamlit.app/.
Note: Companies with no available information on the CBE website were excluded from the dataset.
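
The geocoding step can be sketched as follows. This is a minimal illustration, not the project's actual script: the `address` key, the helper names `geocode_rows` and `make_nominatim_geocoder`, and the `user_agent` string are assumptions. It uses Geopy's `Nominatim` geocoder (the OpenStreetMap geocoding service) wrapped in `RateLimiter`, since Nominatim's public usage policy asks for at most one request per second. Rows that cannot be geocoded are kept with empty coordinates here; that choice is also an assumption.

```python
def make_nominatim_geocoder(user_agent="ai-landscape-be-example"):
    """Build a rate-limited geocode callable (user_agent value is hypothetical)."""
    # Imported lazily so the rest of the sketch works without geopy installed.
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # Wait at least one second between calls to respect the public service.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)


def geocode_rows(rows, geocode):
    """Return copies of rows with 'latitude' and 'longitude' added.

    rows    -- iterable of dicts with an 'address' key (assumed column name)
    geocode -- callable taking an address string and returning an object
               with .latitude/.longitude, or None when nothing is found
    """
    geocoded = []
    for row in rows:
        location = geocode(row["address"])
        geocoded.append({
            **row,
            "latitude": location.latitude if location else None,
            "longitude": location.longitude if location else None,
        })
    return geocoded
```

With Geopy installed, `geocode_rows(rows, make_nominatim_geocoder())` would query the live service, where `rows` could come from `csv.DictReader` over the cleaned CSV. Passing the geocoder in as a parameter keeps the function easy to test with a stub.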
- Some companies may have multiple locations which are unaccounted for.
- Implement multi-threading to speed up processing time
- Include start-ups listed on other websites such as IMEC Start and DigitalWallonia
- Give the option for the user to check available employment opportunities
- Implement an LLM that can answer questions about the companies in the dataset
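
As a rough idea of what the multi-threading improvement could look like, here is a small sketch using the standard library's `ThreadPoolExecutor`. The helper name `fetch_all` and the `max_workers` default are assumptions, and `fetch` stands in for whatever per-company function the scraping scripts use (not shown in this README).

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(items, fetch, max_workers=8):
    """Apply fetch to every item using a thread pool.

    Results come back in the same order as items, because
    Executor.map preserves input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, items))
```

Threads mainly help with I/O-bound work such as waiting on web pages. Any rate limits of the target websites still apply, so the number of workers may need to stay low.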
Let's connect: https://www.linkedin.com/in/vriveraq/