{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# OpenDataScience channel audience stats\n", "\n", "This notebook serves two main purposes:\n", "\n", "- Describes the audience statistics of the [Open Data Science](https://t.me/opendatascience) Telegram channel.\n", "- Shows how Exploratory Data Analysis can be performed.\n", "\n", "We present some common techniques for representing data and show how different plot types or data manipulations can make a plot more interpretable.\n", "\n", "**Important**: This is the short, brief version of the main EDA notebook; it does not show in detail what was done with the data or how. Treat it as a dashboard. To learn about the code and the exploratory analysis, see the [more verbose version](research_eda.ipynb).\n", "\n", "This notebook is available in the [GitHub](https://github.com/open-data-science/ods_channel_stats_eda) repo for corrections, additions and edits. All pull requests are welcome.\n", "\n", "## Imports" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The autoreload extension is already loaded. To reload it, use:\n", " %reload_ext autoreload\n" ] } ], "source": [ "import sys\n", "from io import StringIO\n", "import datetime\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import plotly.express as px\n", "import plotly.graph_objects as go\n", "from plotly.subplots import make_subplots\n", "from wordcloud import WordCloud\n", "from collections import Counter\n", "\n", "%matplotlib inline\n", "plt.style.use('seaborn')\n", "from eda_utils import Eda\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 01. 
Data preparation and general information" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
|  | Timestamp | Country | Timezone | Education | Work | Experience | Age | Sat_update | Sat_material | Interests | How_found | Recommend | Why | If you want to reach for the editors and to write something, please use the field below: |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-01-29 13:30:57-03:00 | Ukraine | GMT+3 | Undergrad | Student + part time remote job | Middle | 18-24 | Yes, it's about perfect | It's all ok | #CV #DL #imageprocessing #videolearning;#RL #D... | Forward from a friend | 5 | All stuff is absolutely brilliant! Thank you f... | NaN |
| 1 | 2020-01-29 13:31:19-03:00 | Russia | GMT+3 | Graduate | Employed | Middle | 31-42 | Nope, less frequent posting will be all right ... | It's all ok | #RL #DL;#NLP #NLU #conversational #dialoguesys... | It's been so long time ago, I can't remember (... | 4 | it's ok | post some jobs with salary ranges, especially ... |
| 2 | 2020-01-29 13:32:48-03:00 | Ukraine | GMT+2 | PhD | Unemployed | Novice (Studying courses, active learning) | 25-30 | Yes, it's about perfect | Need more specific and complicated materials | #WhereToStart #EntryLevel #Novice #MOOC #Learn... | Telegram channel search | 3 | NaN | NaN |
| 3 | 2020-01-29 13:33:27-03:00 | Italy | GMT+1 | No degree at all, still learning / self-taught | Student | Novice (Studying courses, active learning) | 18-24 | Yes, it's about perfect | It's all ok | #CV #DL #imageprocessing #videolearning;#RL #D... | Forward from a friend | 5 | Mainly due to material shared | NaN |
| 4 | 2020-01-29 13:33:49-03:00 | Ukraine | GMT+2 | Graduate | Employed | Middle | 18-24 | Yes, it's about perfect | It's all ok | #RL #DL;#NLP #NLU #conversational #dialoguesys... | It's been so long time ago, I can't remember (... | 2 | It's not super useful actually. Good enough to... | NaN |