Scraping a Large Set of Products: Data Analysis
On this final page, we discuss the analysis and visualization of the data collected from the Mayesh online shop, then outline our conclusions and future work.
7. Data Analysis
With the cleaned and processed data, we conducted various analyses to understand the trends and patterns in the product offerings. Here’s what we did:
- Price Distribution: We analyzed the distribution of prices to identify the range and common price points for different types of flowers.
- Popular Products: By aggregating user reviews and ratings, we identified the most popular products in the dataset.
- Category Analysis: We explored the number of products in each category to determine which types of flowers are most common.
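The three analyses above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the exact code used in the project. The sample records and the field names (`name`, `category`, `price`, `rating`, `reviews`) are hypothetical, and ranking popularity by review count with rating as a tie-breaker is an assumption about how reviews and ratings might be aggregated.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample records; the real field names depend on the scraper's output.
products = [
    {"name": "Red Rose", "category": "Roses", "price": 1.50, "rating": 4.8, "reviews": 120},
    {"name": "White Rose", "category": "Roses", "price": 1.75, "rating": 4.5, "reviews": 80},
    {"name": "Yellow Tulip", "category": "Tulips", "price": 0.90, "rating": 4.2, "reviews": 45},
    {"name": "Purple Orchid", "category": "Orchids", "price": 6.00, "rating": 4.9, "reviews": 30},
]


def price_summary(items):
    """Return the range and central tendency of product prices."""
    prices = [p["price"] for p in items]
    return {
        "min": min(prices),
        "max": max(prices),
        "mean": mean(prices),
        "median": median(prices),
    }


def category_counts(items):
    """Count how many products fall into each category."""
    return Counter(p["category"] for p in items)


def most_popular(items, n=3):
    """Rank products by review count, then rating (an assumed popularity measure)."""
    return sorted(items, key=lambda p: (p["reviews"], p["rating"]), reverse=True)[:n]


if __name__ == "__main__":
    print(price_summary(products))
    print(category_counts(products).most_common())
    print([p["name"] for p in most_popular(products, n=2)])
```

On a full dataset, a dataframe library would be a natural fit for the same steps, but the logic stays the same: summarize prices, count products per category, and sort by a popularity measure.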
8. Data Visualization
To better communicate our findings, we created several visualizations:
- Price Histogram: A histogram showing the distribution of product prices.
- Category Pie Chart: A pie chart representing the proportion of each flower category.
- Popularity Bar Chart: A bar chart displaying the most popular products based on user reviews.
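As a rough sketch of what feeds these charts, the snippet below computes the bin counts behind a price histogram and the proportions behind a category pie chart, then renders a simple text bar chart. It uses only the Python standard library. The function names, the bin width, and the sample values are illustrative assumptions, and any plotting library could draw the actual figures from the same numbers.

```python
from collections import Counter


def histogram_bins(prices, bin_width):
    """Count prices per fixed-width bin; each key is the bin's lower edge."""
    counts = Counter(int(p // bin_width) * bin_width for p in prices)
    return dict(sorted(counts.items()))


def category_shares(categories):
    """Return each category's share of the total, as a fraction between 0 and 1."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def text_bar_chart(data, width=20):
    """Render a label-to-value mapping as lines of '#' bars scaled to the largest value."""
    peak = max(data.values())
    return [
        f"{label:>10} | {'#' * round(value / peak * width)}"
        for label, value in data.items()
    ]


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the scraped dataset.
    prices = [0.90, 1.50, 1.75, 6.00]
    categories = ["Roses", "Roses", "Tulips", "Orchids"]

    print(histogram_bins(prices, bin_width=2))
    print(category_shares(categories))
    print("\n".join(text_bar_chart(Counter(categories))))
```

Keeping the aggregation separate from the rendering makes it easy to check the numbers before committing to a particular chart style.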
9. Conclusion
Our project successfully scraped and analyzed thousands of products from Mayesh's online catalog. Key insights include:
- The majority of products fall within the moderate price range, indicating a market focus on affordable options for bulk buyers.
- Roses and Tulips are the most common flowers, but exotic flowers like Orchids also have a significant presence.
- Popular products often have detailed descriptions and competitive pricing.
10. Future Work
To extend this project, we are considering the following directions:
- Expand the Dataset: Include more products from additional categories and other seasons.
- Real-Time Analysis: Develop a real-time dashboard to track price changes and stock levels.
- Recommendation Engine: Use machine learning to recommend products to users based on their browsing behavior and preferences.