<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>SSPHub</title>
<link>https://ssphub-test.netlify.app/</link>
<atom:link href="https://ssphub-test.netlify.app/index.xml" rel="self" type="application/rss+xml"/>
<description>Network of data scientists working for the French administration</description>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Tue, 17 Mar 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>sndsTools, an R package for extracting healthcare utilization in SNDS health data</title>
  <link>https://ssphub-test.netlify.app/project/2026_sndsTools/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level2">
<h2 class="anchored" data-anchor-id="project-summary">Project summary</h2>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th><code>sndsTools</code>, an R package for extracting healthcare utilization in SNDS health data</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td><code>sndsTools</code> is an R package designed to simplify the extraction of healthcare utilization data from the Système National de Données de Santé (SNDS), whose health data are hosted on the Health Insurance portal. It streamlines the extraction steps for the data used in most SNDS studies.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee, Institut du Cerveau, Inria, AP-HM</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The R package <code>sndsTools</code> is in production and available for download from GitHub <i class="fa-brands fa-github" aria-label="github"></i>.</td>
</tr>
<tr class="even">
<td><strong>Products and project documentation</strong></td>
<td><a href="https://sndstoolers.github.io/sndsTools/index.html">R package documentation site</a></td>
</tr>
<tr class="odd">
<td><strong>Project code</strong></td>
<td>The code is available on GitHub <i class="fa-brands fa-github" aria-label="github"></i> <a href="https://github.com/SNDStoolers/sndsTools">https://github.com/SNDStoolers/sndsTools</a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level2">
<h2 class="anchored" data-anchor-id="similar-projects">Similar projects</h2>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="cGFja2FnZSUyQ1IlMkNpbiUyMHByb2R1Y3Rpb24lMkNJbnNlZQ==" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419540" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="200">
<a href="../../project/2023_doremifasol/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_doremifasol/doremifasol.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Doremifasol
</h5>
<div class="card-text listing-description delink">
<p>The package <i class="fa-brands fa-r-project" aria-label="r-project"></i> <code>Doremifasol</code> makes it easier for data scientists to retrieve Insee data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNwYWNrYWdlJTJDUHl0aG9u" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="151">
<a href="../../project/2023_pynsee/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_pynsee/example_pynsee.webp" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
pynsee, a <i class="fa-brands fa-python" aria-label="python"></i> Python package for retrieving INSEE data
</h5>
<div class="card-text listing-description delink">
<p>The <i class="fa-brands fa-python" aria-label="python"></i> package <code>pynsee</code> makes it easier for data scientists to retrieve INSEE data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="UHl0aG9uJTJDcGFja2FnZSUyQ2RlZXAlMjBsZWFybmluZyUyQ3NhdGVsbGl0ZSUyMGltYWdlcnk=" data-listing-date-sort="1664582400000" data-listing-file-modified-sort="1778082419539" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="454">
<a href="../../project/2022_satellites/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_satellites/Satellites_Mayotte.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using satellite images for official statistics
</h5>
<div class="card-text listing-description delink">
<p>Using satellite images to improve population censuses in the French overseas territories</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>Insee</category>
  <category>package</category>
  <category>data extraction</category>
  <category>R</category>
  <category>administrative data</category>
  <guid>https://ssphub-test.netlify.app/project/2026_sndsTools/</guid>
  <pubDate>Tue, 17 Mar 2026 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" medium="image" type="image/png" height="132" width="144"/>
</item>
<item>
  <title>Comparison of forecasts between nowcasting and bottom-up approach</title>
  <link>https://ssphub-test.netlify.app/project/2025_nowcasting/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Comparison of continuous GDP forecasting methods with a bottom-up method</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>INSEE bases its GDP growth forecasts on a so-called bottom-up model, which involves reproducing the mechanism for constructing quarterly accounts. The question arises, however, as to the performance of such an approach to forecasting GDP compared with that of a direct method such as <em>nowcasting</em>. A direct approach makes it possible to forecast GDP growth directly from business cycle series, without using quarterly accounts. In order to study the relative performance of the two approaches, we developed a new bottom-up forecasting model for GDP growth in France, inspired by the Federal Reserve Bank of Atlanta’s “GDPNow” model.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>At the beginning of the quarter, the direct approach performs slightly better than the bottom-up approach: few quantitative indicators are available at that time, and the added value represented by the use of a complete accounting framework is limited compared with a direct model, which is naturally more parsimonious. From the end of the second month onwards, on the other hand, more quantitative indicators become available, and the bottom-up approach performs slightly better than the direct approach estimated in this study, particularly at the time of publication of the Business Review. Finally, the difference in performance between the two approaches is clear between the end of the quarter and the publication of the first estimates of the quarterly accounts thirty days later: during this period, it is much better to use the information available via a bottom-up approach rather than a direct approach.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.insee.fr/en/statistiques/8642689?sommaire=8642697">How do the direct and bottom-up approaches compare in forecasting GDP for the current quarter?</a> INSEE Economic Outlook - September 2025</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNiYW5rJTIwYWNjb3VudCUyMGRhdGElMkNleHBlcmltZW50" data-listing-date-sort="1748736000000" data-listing-file-modified-sort="1778082419544" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="579">
<a href="../../project/2025_comptes_bancaires_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/money.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Use of banking data for INSEE economic forecasts
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('YmFuayUyMGFjY291bnQlMjBkYXRh'); return false;">bank account data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2025
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="bWFjaGluZSUyMGxlYXJuaW5nJTJDbm93Y2FzdGluZyUyQ2V4cGVyaW1lbnQlMjBzdG9wcGVkJTJDZm9yZWNhc3RzJTJDSW5zZWU=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419513" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="306">
<a href="../../project/2019_gdp_tracker/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2019_gdp_tracker/evol_growth_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
GDP Tracker: a tool for continuous economic forecasting
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('bm93Y2FzdGluZw=='); return false;">nowcasting</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudCUyMHN0b3BwZWQ='); return false;">experiment stopped</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

</div>
<div class="card-text listing-description delink">
<p><em>Machine learning</em> models for real-time forecasting (<em>nowcasting</em>) to feed INSEE’s economic analyses</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

</div>
<div class="card-text listing-description delink">
<p>Using a continuous flow of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="Zm9yZWNhc3RzJTJDZXhwZXJpbWVudCUyQ0luc2VlJTJDY3JlZGl0JTIwY2FyZCUyMGRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419514" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="561">
<a href="../../project/2020_cb_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_cb_conj/cb_conj.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using credit card data and mobile phone data to forecast economic activity
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>The 2020 health crisis required a review of forecasting processes to be more responsive to events. INSEE used credit card transaction data to forecast economic activity.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNwcml2YXRlJTIwZGF0YSUyQ2V4cGVyaW1lbnQ=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="302">
<a href="../../project/2020_electricite_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_electricite_conj/electricity.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
What do the electricity production and consumption data say about economic activity during the containment period?
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('cHJpdmF0ZSUyMGRhdGE='); return false;">private data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
<p>Using electricity production and consumption data to forecast economic activity</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>study</category>
  <category>nowcasting</category>
  <category>forecasts</category>
  <category>Insee</category>
  <guid>https://ssphub-test.netlify.app/project/2025_nowcasting/</guid>
  <pubDate>Mon, 01 Sep 2025 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2025_nowcasting/comp_en.png" medium="image" type="image/png" height="54" width="144"/>
</item>
<item>
  <title>Use of banking data for INSEE economic forecasts</title>
  <link>https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Use of banking data for INSEE economic forecasts</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>Bank account data are an early source of information on household consumption and savings in 2020, at both the microeconomic and sub-annual levels. Anonymised data made available by Crédit Mutuel Alliance Fédérale make it possible to study how the health crisis may have affected the financial situation of the bank’s household customers, depending on their income level, age or socio-professional category. <br> The banking data do not identify household incomes directly, but incomes can be approximated from the transfers and cheques credited to the accounts. These inflows fell during the first lockdown before rebounding in June. On average, the second lockdown does not appear to have reduced these inflows.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>These studies extend the work carried out on households using bank account data. <br> <br> During the two lockdowns of 2020, all the household groups studied, whatever their income level, appear to have reduced their consumption, which refocused on essential goods, particularly in April. Households that consumed the most before the crisis, mainly executives and high-income earners, therefore appear to have reduced their consumption more. <br> The fall in consumption triggered a surge in savings, which fed into household current accounts and passbook accounts. Household gross financial assets (liquid savings, securities accounts and life insurance, excluding loans) are estimated to have risen sharply in 2020. This increase is visible across all household groups, whatever their level of financial wealth. It is larger in euros for households with high financial wealth, who were able to save more by reducing their consumption. Households with low wealth also put money aside, particularly during the first lockdown. However, the amounts involved for these households, generally a few tens or hundreds of euros, remain small even though they represent a significant proportion of their initial wealth. Among working households, some appear to have been more affected than others by a fall in income and therefore to have increased their savings less: this is the case for craftsmen and shopkeepers and for private-sector employees, in contrast to public-sector employees. <br> <br> The analysis conducted for the June 2025 economic outlook shows that changes in the income, consumption and savings rate aggregates derived from these bank account data are consistent with those in the national accounts. While the average savings rate increased in almost all groups between 2023 and 2024, the increase was strongest among the oldest: people aged 65 and over accounted for around two-thirds of it.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.insee.fr/fr/statistiques/5232043">In 2020, the drop in consumption fuelled savings, with a particular increase in the financial wealth of the most well-off: some results obtained by analysing banking data</a> Insee Economic Outlook - March 2021; <br> - <a href="https://www.insee.fr/en/statistiques/8595742?sommaire=8595764">In 2024, the income of retired customers of La Banque Postale rose sharply but their consumption did not keep pace, which contributed to two-thirds of the increase in the savings ratio</a> Insee Economic Outlook - June 2025</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="ZXhwZXJpbWVudCUyQ21vYmlsZSUyMHBob25lJTIwZGF0YSUyQ2NyZWRpdCUyMGNhcmQlMjBkYXRhJTJDSW5zZWU=" data-listing-date-sort="1704067200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="275">
<a href="../../project/2024_cb_mno_tabac/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2024_cb_mno_tabac/tabac.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
An assessment of cross-border tobacco purchases and associated tax losses in France
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

</div>
<div class="card-text listing-description delink">
<p>Using the closing of borders in 2020 as a natural experiment to measure cross-border tobacco purchases</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2024
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="YXV0b21hdGljJTIwY29kaW5nJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNzY2FubmVyJTIwZGF0YQ==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="328">
<a href="../../project/2022_Enquete_Budget_Famille/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Methodological work on the Family Budget survey
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YSUyMGV4dHJhY3Rpb24='); return false;">data extraction</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

</div>
<div class="card-text listing-description delink">
<p>Modernisation of the family budget survey using automatic classification tools</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="Zm9yZWNhc3RzJTJDZXhwZXJpbWVudCUyQ0luc2VlJTJDY3JlZGl0JTIwY2FyZCUyMGRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419514" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="561">
<a href="../../project/2020_cb_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_cb_conj/cb_conj.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using credit card data and mobile phone data to forecast economic activity
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>The 2020 health crisis required a review of forecasting processes to be more responsive to events. INSEE used credit card transaction data to forecast economic activity.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNwcml2YXRlJTIwZGF0YSUyQ2V4cGVyaW1lbnQ=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="302">
<a href="../../project/2020_electricite_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_electricite_conj/electricity.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
What do the electricity production and consumption data say about economic activity during the containment period?
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('cHJpdmF0ZSUyMGRhdGE='); return false;">private data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
<p>Using electricity production and consumption data to forecast economic activity</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="ZGF0YXZpc3VhbGlzYXRpb24lMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNJbnNlZSUyQ2V4cGVyaW1lbnQlMkNvcGVuLWRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1604188800000" data-listing-file-modified-sort="1778082419516" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="386">
<a href="../../project/2020_mvtpop/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_mvtpop/mvtpop.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Population movements around the March 2020 containment using mobile phone network operators data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('b3Blbi1kYXRh'); return false;">open-data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>INSEE has had access to mobile telephony data as part of the monitoring of the 2020 health crisis. These data were used to produce the following statistics on population…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Nov 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('UHl0aG9u'); return false;">Python</div>

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q09JQ09Q'); return false;">COICOP</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q1BJ'); return false;">CPI</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

</div>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data in the COICOP nomenclature to calculate the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="6" data-categories="SW5zZWUlMkNleHBlcmltZW50JTJDbW9iaWxlJTIwcGhvbmUlMjBkYXRhJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1514764800000" data-listing-file-modified-sort="1778082419511" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="208">
<a href="../../project/2018_segregation/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2018_segregation/indice_segregation.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Urban segregation: insights from mobile phone data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

</div>
<div class="card-text listing-description delink">
<p>Merging administrative data and MNO data to estimate urban segregation at a local level</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2018
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>

</div>
</section>
<section id="other-studies-using-bank-data" class="level1">
<h1>Other studies using bank data</h1>
<p>Insee has also carried out other studies using bank account data. They are available on the Insee website:</p>
<div id="listing-autres-comptes-bancaires" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-listing-date-sort="1733011200000">
<a href="https://www.insee.fr/fr/information/8264558?sommaire=8264562" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/2024_courrier_stats.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="other-studies-using-bank-data">
The economy as told by banking data - What our account statements say about us (French only)
</h5>
<div class="card-text listing-description delink">
<p>Courrier des statistiques n°12, Insee, December 2024</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 Dec 2024
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-listing-date-sort="1714521600000">
<a href="https://www.insee.fr/en/statistiques/8183838" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/2024_DT2024_08.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Cross-border shopping for fuel at the France-Germany border
</h5>
<div class="card-text listing-description delink">
<p>Insee Working Papers n°2024-08, May 2024</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 May 2024
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-listing-date-sort="1701388800000">
<a href="https://www.insee.fr/en/statistiques/7734894" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/2023_InseeA90.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Household finance on a day-to-day basis
</h5>
<div class="card-text listing-description delink">
<p>Insee Analyses n°90, December 2023</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 Dec 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-listing-date-sort="1664582400000">
<a href="https://www.insee.fr/en/statistiques/6691801" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/2022_InseeA76.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Financial insecurity slightly up due to rising inflation, though lower in August 2022 than before the pandemic
</h5>
<div class="card-text listing-description delink">
<p>Insee Analyses n°76, October 2022</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-listing-date-sort="1664582400000">
<a href="https://journees-methodologie-statistique.insee.net/une-mesure-de-la-reponse-en-consommation-a-des-chocs-de-revenus-a-partir-des-donnees-bancaires/" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/assets/media/logo_Insee.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
A measurement of marginal propensity to consume after external shocks using bank account data
</h5>
<div class="card-text listing-description delink">
<p>Journées de méthodologie statistique 2022</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-listing-date-sort="1635724800000">
<a href="https://www.insee.fr/en/statistiques/6015053" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/2021_InseeA69.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Impact of the health crisis on an anonymised panel of La Banque Postale customers
</h5>
<div class="card-text listing-description delink">
<p>Insee Analyses n°69, November 2021</p>
</div>
<div class="card-attribution card-text-small end">
<div class="listing-date">
1 Nov 2021
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>Insee</category>
  <category>forecasts</category>
  <category>bank account data</category>
  <category>experiment</category>
  <guid>https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/</guid>
  <pubDate>Sun, 01 Jun 2025 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/money.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>An assessment of cross-border tobacco purchases and associated tax losses in France</title>
  <link>https://ssphub-test.netlify.app/project/2024_cb_mno_tabac/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>An assessment of cross-border tobacco purchases and associated tax losses in France</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>Smoking is a major public health problem, causing many preventable diseases worldwide. In recent decades, increasing the price of tobacco has become the main strategy used by governments to reduce smoking. However, price differences between neighbouring countries are likely to limit the effectiveness of this measure by allowing some consumers to buy tobacco at a lower price across the border. Although cross-border shopping is not a new problem, its extent remains poorly understood and is still the subject of regular debate. This study contributes to its assessment in France by exploiting an unprecedented natural experiment: the closure of land borders between March 2020 and June 2020 as part of the fight against the Covid-19 pandemic.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The results show that the closure of the borders led to a 9.5% surplus in tobacco purchases in mainland France compared to the counterfactual situation in which the borders had remained open. This result probably underestimates cross-border purchases: some tobacco purchases abroad may have continued during the first lockdown, as the borders were not completely closed, in particular for cross-border workers. Extrapolating the consumption observed in the rest of the country to border regions with identical characteristics, the revenue generated in France would be about 13.5% higher if there were no cheaper alternatives abroad.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.insee.fr/en/statistiques/8172204">An assessment of cross-border tobacco purchases and associated tax losses in France</a>, Insee Working Papers No.&nbsp;2024-06, April 2024</td>
</tr>
</tbody>
</table>
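<p>To make the 9.5% figure concrete, here is a minimal arithmetic sketch. It assumes, purely for illustration, that total tobacco consumption is the same with and without the border closure, so that the whole surplus corresponds to purchases that would otherwise have been made abroad. The index value of 100 and the function name are hypothetical and do not come from the working paper.</p>

```python
# Illustrative arithmetic only: the 9.5% surplus is taken from the summary above.
# The index of 100 and the constant-consumption assumption are hypothetical.


def implied_cross_border_share(surplus: float) -> float:
    """Share of total purchases made abroad when borders are open,
    assuming total consumption is unchanged by the border closure."""
    return surplus / (1.0 + surplus)


counterfactual = 100.0  # hypothetical index: purchases in France with open borders
surplus = 0.095         # 9.5% surplus observed while the borders were closed
observed = counterfactual * (1.0 + surplus)

share_abroad = implied_cross_border_share(surplus)
print(f"Observed purchases (index): {observed:.1f}")
print(f"Implied share bought abroad: {share_abroad:.1%}")
```

<p>Under this assumption the implied share is a little under 9% of total purchases. As the summary notes, the borders were not completely closed, so a figure obtained this way would probably be a lower bound.</p>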
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNiYW5rJTIwYWNjb3VudCUyMGRhdGElMkNleHBlcmltZW50" data-listing-date-sort="1748736000000" data-listing-file-modified-sort="1778082419544" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="579">
<a href="../../project/2025_comptes_bancaires_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/money.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Use of banking data for INSEE economic forecasts
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('YmFuayUyMGFjY291bnQlMjBkYXRh'); return false;">bank account data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2025
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="YXV0b21hdGljJTIwY29kaW5nJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNzY2FubmVyJTIwZGF0YQ==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="328">
<a href="../../project/2022_Enquete_Budget_Famille/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Methodological work on the Family Budget survey
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YSUyMGV4dHJhY3Rpb24='); return false;">data extraction</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

</div>
<div class="card-text listing-description delink">
<p>Modernisation of the family budget survey using automatic classification tools</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="Zm9yZWNhc3RzJTJDZXhwZXJpbWVudCUyQ0luc2VlJTJDY3JlZGl0JTIwY2FyZCUyMGRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419514" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="561">
<a href="../../project/2020_cb_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_cb_conj/cb_conj.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using credit card data and mobile phone data to forecast economic activity
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>The 2020 health crisis required a review of forecasting processes to be more responsive to events. INSEE used credit card transaction data to forecast economic activity.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNwcml2YXRlJTIwZGF0YSUyQ2V4cGVyaW1lbnQ=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="302">
<a href="../../project/2020_electricite_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_electricite_conj/electricity.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
What do the electricity production and consumption data say about economic activity during the containment period?
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('cHJpdmF0ZSUyMGRhdGE='); return false;">private data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
<p>Using electricity production and consumption data to forecast economic activity</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="ZGF0YXZpc3VhbGlzYXRpb24lMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNJbnNlZSUyQ2V4cGVyaW1lbnQlMkNvcGVuLWRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1604188800000" data-listing-file-modified-sort="1778082419516" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="386">
<a href="../../project/2020_mvtpop/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_mvtpop/mvtpop.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Population movements around the March 2020 containment using mobile phone network operators data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('b3Blbi1kYXRh'); return false;">open-data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>INSEE has had access to mobile telephony data as part of the monitoring of the 2020 health crisis. These data were used to produce the following statistics on population…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Nov 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('UHl0aG9u'); return false;">Python</div>

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q09JQ09Q'); return false;">COICOP</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q1BJ'); return false;">CPI</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

</div>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data in the COICOP nomenclature to calculate the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="6" data-categories="SW5zZWUlMkNleHBlcmltZW50JTJDbW9iaWxlJTIwcGhvbmUlMjBkYXRhJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1514764800000" data-listing-file-modified-sort="1778082419511" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="208">
<a href="../../project/2018_segregation/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2018_segregation/indice_segregation.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Urban segregation: insights from mobile phone data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

</div>
<div class="card-text listing-description delink">
<p>Merging administrative data and MNO data to estimate urban segregation at a local level</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2018
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>

</div>



</section>

 ]]></description>
  <category>experiment</category>
  <category>mobile phone data</category>
  <category>credit card data</category>
  <category>Insee</category>
  <guid>https://ssphub-test.netlify.app/project/2024_cb_mno_tabac/</guid>
  <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2024_cb_mno_tabac/tabac.png" medium="image" type="image/png" height="143" width="144"/>
</item>
<item>
  <title>scanR, an application for monitoring France’s research and innovation landscape</title>
  <link>https://ssphub-test.netlify.app/project/2024_scanr/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Explore the world of French Research and Innovation with scanR</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>scanR is a <strong>web application to help characterise public structures</strong> (research units of all types, public institutions) and private structures (companies) involved in research and innovation in France. scanR also helps to identify <strong>the direction of the work of researchers active in France</strong> since the early 1990s. <br> <br> scanR combines <strong>structured data under an open licence</strong> (publications and theses, participation in collaborative research projects, spin-offs, patents, etc.) and <strong>open information</strong> extracted directly from the websites of research and innovation players. This information comes from around 13 different sources (theses.fr, the <a href="../../project/2022_bso/index.html">open science barometer</a>, HAL, European Commission, INPI, ANR, European Patent Office, etc.) and is obtained by web scraping, PDF mining or APIs. The resources are then <strong>identified and linked together</strong>, in particular using AI methods, and then enriched. <br> A <strong>feedback loop</strong> has been set up to improve the quality of the data produced: users can request data corrections directly from the scanR site. <br> An <strong>engine based on Elasticsearch</strong> lets users search for themes, structures or authors. <br> Finally, it is now possible to visually display interactions between different structures or themes using <strong>network analysis</strong>.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Statistical Service of the Ministry of Higher Education and Research (<a href="https://www.enseignementsup-recherche.gouv.fr/fr/sies">SIES</a>)</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The scanR project has been in production since 2016 and records around 50,000 monthly visits. <br> <br> - The project provides <strong>profiles by structure and author</strong> presenting their organisation, activity, areas of research and sources of funding. <br> - It also offers a <strong>search engine</strong> covering research structures, authors, funding, publications and patents in France. <br> - It also provides <strong>tools for analysing results</strong>, including graph visualisations. <br> <br> The project also provides <strong>several APIs</strong> to retrieve the data.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>scanR is available at <a href="https://scanr.enseignementsup-recherche.gouv.fr">scanr.enseignementsup-recherche.gouv.fr</a>. This site includes: <br> - the <a href="https://scanr.enseignementsup-recherche.gouv.fr/docs/overview">documentation</a> to access the four APIs available; <br> - the various <a href="https://scanr.enseignementsup-recherche.gouv.fr/about/resources">data sources</a> used for the project. <br> <br> <strong>Methodology and technical details</strong>: <br> - Detailed technical presentation of scanR: <a href="https://hal.science/hal-04813230v1"><em>scanR - Explore public data on French research and innovation</em></a>, euroCRIS conference, November 2024; <br> - <a href="https://hal.science/hal-04892262v1">Mapping scientific communities at scale</a></td>
</tr>
<tr class="odd">
<td><strong>Project code</strong></td>
<td>- The code is available on GitHub <i class="fa-brands fa-github" aria-label="github"></i> <a href="https://github.com/dataesr/scanr-ui" class="uri">https://github.com/dataesr/scanr-ui</a></td>
</tr>
</tbody>
</table>
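<p>The network analysis mentioned above can be pictured with a small sketch. It is not scanR's implementation: the publication records, the structure names and the function <code>build_collaboration_edges</code> are all hypothetical. It only shows one common way to turn linked records into a weighted graph, by counting how many publications each pair of structures shares.</p>

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each publication lists the structures that signed it.
publications = [
    {"id": "pub-1", "structures": ["Lab A", "Lab B", "Company X"]},
    {"id": "pub-2", "structures": ["Lab A", "Lab B"]},
    {"id": "pub-3", "structures": ["Lab B", "Company X"]},
]


def build_collaboration_edges(records):
    """Count, for each pair of structures, the number of shared publications."""
    edges = Counter()
    for record in records:
        # Deduplicate and sort so that (A, B) and (B, A) map to the same edge.
        names = sorted(set(record["structures"]))
        for pair in combinations(names, 2):
            edges[pair] += 1
    return edges


edges = build_collaboration_edges(publications)
for (source, target), weight in edges.most_common():
    print(f"{source} -- {target}: {weight} shared publication(s)")
```

<p>The resulting edge weights could then be passed to any graph library or visualisation tool. The same counting approach would apply to themes instead of structures.</p>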
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="174">
<a href="../../project/2022_Curiexplore/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Curiexplore, the platform for comparing national education and research policies
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('U0lFUw=='); return false;">SIES</div>

</div>
<div class="card-text listing-description delink">
<p>Interactive visualisation of the teaching environment and research environment in different countries.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="d2Vic2NyYXBpbmclMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ29wZW4tZGF0YSUyQ2luJTIwcHJvZHVjdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419525" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="304">
<a href="../../project/2022_bso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Open Science Monitor
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('b3Blbi1kYXRh'); return false;">open-data</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('U0lFUw=='); return false;">SIES</div>

</div>
<div class="card-text listing-description delink">
<p>To be able to monitor the opening up of scientific publications (the objective of the <strong>national plan for open science</strong>), the statistical service of the Ministry of Higher…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>API</category>
  <category>in production</category>
  <category>open-data</category>
  <category>SIES</category>
  <category>datavisualisation</category>
  <category>ElasticSearch</category>
  <guid>https://ssphub-test.netlify.app/project/2024_scanr/</guid>
  <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2024_scanr/scanr_en.png" medium="image" type="image/png" height="34" width="144"/>
</item>
<item>
  <title>Visualisations of SARS-CoV-2 test data</title>
  <link>https://ssphub-test.netlify.app/project/2022_drees_dataviz/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">

<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Automatic publication and data-visualisation of Covid-19 test data</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>In recent years, <code>DREES</code> (the statistical department of the Ministry of Health) has been making available, as part of its <em>open-data</em> strategy, interactive visualisations of data on a variety of topics (Covid, of course, but also the number of healthcare professionals, pensions and retirement ages, etc.). These visualisations are based on <i class="fa-brands fa-r-project" aria-label="r-project"></i> <code>Shiny</code> and give users the freedom to play with the data and update exploratory graphs. <br> Between October 2020 and June 2023, DREES published weekly data on the volume and validation times of RT-PCR and antigen tests. These results are based on data from the SI-DEP information system.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>DREES</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The project published automatically updated information on Covid-19 test volumes and validation times between 2020 and mid-2023. The data is still available at <a href="https://drees.shinyapps.io/delais_test_app/" class="uri">https://drees.shinyapps.io/delais_test_app/</a> but has not been updated since June 2023.</td>
</tr>
</tbody>
</table>


</section>

 ]]></description>
  <category>R</category>
  <category>datavisualisation</category>
  <category>experiment stopped</category>
  <category>DREES</category>
  <category>open-data</category>
  <guid>https://ssphub-test.netlify.app/project/2022_drees_dataviz/</guid>
  <pubDate>Thu, 01 Jun 2023 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_drees_dataviz/dataviz.png" medium="image" type="image/png" height="39" width="144"/>
</item>
<item>
  <title>Doremifasol</title>
  <link>https://ssphub-test.netlify.app/project/2023_doremifasol/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Package <i class="fa-brands fa-r-project" aria-label="r-project"></i> to retrieve data from the Insee website</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>doremifasol (in French, an acronym for “data in R made available by Insee and easily retrievable”) is an R package whose main aim is to showcase the data available on Insee’s website, helping users to put them on stage and extract the information they carry. It is about analysing data, creating maps, quantifying phenomena and, more generally, using the data without the painful effort of retrieving them from the website and importing them into R’s memory. The name of the package echoes the first five notes of the musical scale and, pushing the metaphor, underlines its aim of helping users to easily practise their solfège in <i class="fa-brands fa-r-project" aria-label="r-project"></i>.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The DoReMiFaSol package is open source and can be downloaded from GitHub. The documentation is also published <a href="https://inseefrlab.github.io/DoReMIFaSol/">here</a>.</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>- The code is available in the GitHub <i class="fa-brands fa-github" aria-label="github"></i> repo <a href="https://github.com/InseeFrLab/DoReMIFaSol">https://github.com/InseeFrLab/DoReMIFaSol</a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNwYWNrYWdlJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNSJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1773705600000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="217">
<a href="../../project/2026_sndsTools/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
sndsTools, an R package for extracting healthcare utilization in SNDS health data
</h5>
<div class="card-text listing-description delink">
<p>The R package <code>sndsTools</code> facilitates the extraction of healthcare utilization from the Système National de Données de Santé (SNDS) health data hosted on the National Health…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
17 Mar 2026
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNwYWNrYWdlJTJDUHl0aG9u" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="151">
<a href="../../project/2023_pynsee/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_pynsee/example_pynsee.webp" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
pynsee, a <i class="fa-brands fa-python" aria-label="python"></i> Python package for retrieving INSEE data
</h5>
<div class="card-text listing-description delink">
<p>The <i class="fa-brands fa-python" aria-label="python"></i> package <code>pynsee</code> makes it easier for data scientists to retrieve INSEE data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="UHl0aG9uJTJDcGFja2FnZSUyQ2RlZXAlMjBsZWFybmluZyUyQ3NhdGVsbGl0ZSUyMGltYWdlcnk=" data-listing-date-sort="1664582400000" data-listing-file-modified-sort="1778082419539" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="454">
<a href="../../project/2022_satellites/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_satellites/Satellites_Mayotte.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using satellite images for official statistics
</h5>
<div class="card-text listing-description delink">
<p>Using satellite images to improve population censuses in the French overseas territories</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>package</category>
  <category>R</category>
  <category>in production</category>
  <category>Insee</category>
  <guid>https://ssphub-test.netlify.app/project/2023_doremifasol/</guid>
  <pubDate>Sun, 01 Jan 2023 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2023_doremifasol/doremifasol.png" medium="image" type="image/png" height="167" width="144"/>
</item>
<item>
  <title>pynsee, a Python package for retrieving INSEE data</title>
  <link>https://ssphub-test.netlify.app/project/2023_pynsee/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th><code>pynsee</code>, a <i class="fa-brands fa-python" aria-label="python"></i> Python package to retrieve data from INSEE</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>pynsee provides rapid access to more than 150,000 macroeconomic series, around ten local data sets, numerous sources available on insee.fr, as well as key metadata and a SIRENE database containing data on all French businesses. Visit the <a href="https://portal-api.insee.fr">API details page</a>.<br> This package is a contribution to reproducible research and to the transparency of public data. It benefits from developments made by the teams working on APIs at INSEE and IGN.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The pynsee package is in production and can be downloaded from GitHub. The documentation is also published <a href="https://pynsee.readthedocs.io/en/latest/">here</a>.</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>- Repo <a href="https://github.com/InseeFrLab/pynsee">GitHub <i class="fa-brands fa-github" aria-label="github"></i></a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects-related-to-package-creation" class="level1">
<h1>Similar projects related to package creation</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNwYWNrYWdlJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNSJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1773705600000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="217">
<a href="../../project/2026_sndsTools/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects-related-to-package-creation">
sndsTools, an R package for extracting healthcare utilization in SNDS health data
</h5>
<div class="card-text listing-description delink">
<p>The R package <code>sndsTools</code> facilitates the extraction of healthcare utilization from the Système National de Données de Santé (SNDS) health data hosted on the National Health…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
17 Mar 2026
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="cGFja2FnZSUyQ1IlMkNpbiUyMHByb2R1Y3Rpb24lMkNJbnNlZQ==" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419540" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="200">
<a href="../../project/2023_doremifasol/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_doremifasol/doremifasol.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Doremifasol
</h5>
<div class="card-text listing-description delink">
<p>The package <i class="fa-brands fa-r-project" aria-label="r-project"></i> <code>Doremifasol</code> makes it easier for data scientists to retrieve Insee data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="UHl0aG9uJTJDcGFja2FnZSUyQ2RlZXAlMjBsZWFybmluZyUyQ3NhdGVsbGl0ZSUyMGltYWdlcnk=" data-listing-date-sort="1664582400000" data-listing-file-modified-sort="1778082419539" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="454">
<a href="../../project/2022_satellites/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_satellites/Satellites_Mayotte.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using satellite images for official statistics
</h5>
<div class="card-text listing-description delink">
<p>Using satellite images to improve population censuses in the French overseas territories</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>in production</category>
  <category>Insee</category>
  <category>package</category>
  <category>Python</category>
  <guid>https://ssphub-test.netlify.app/project/2023_pynsee/</guid>
  <pubDate>Sun, 01 Jan 2023 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2023_pynsee/example_pynsee.webp" medium="image" type="image/webp"/>
</item>
<item>
  <title>Using satellite images for official statistics</title>
  <link>https://ssphub-test.netlify.app/project/2022_satellites/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Using satellite images for official statistics</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>At the end of 2022, an experimental working group on the use of satellite data for official statistics was launched. The aim of this initiative was to <strong>improve the organisation of mapping surveys in overseas departments</strong>, particularly in French Guiana and Mayotte. These surveys are used to update the individual/housing register (RIL) each year, and the project makes it easier to identify new spontaneous or temporary housing areas in advance of the surveys. Algorithms applied to <strong>satellite photographs</strong> are used to identify such areas, which in a given year will require more surveyors than in the previous year. <br> The project is presented in more detail <a href="https://inseefrlab.github.io/satellite-images-docs/src/documentation/index.html">here</a>.</td>
</tr>
<tr class="even">
<td><strong>Actors</strong></td>
<td>Insee (DG and Réunion-Mayotte Interregional Directorate)</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The results of the project are presented in more detail <a href="https://inseefrlab.github.io/satellite-images-docs/">on the dedicated website</a>. The experimentation phase ended with: <br> - The transmission of <strong>building evolution statistics</strong> in Mayotte. These statistics were used to organise the 2024 complementary mapping survey in Mayotte; <br> - The development of a <a href="https://inseefrlab.github.io/satellite-images-webapp/"><strong>visualisation application</strong></a> integrating background maps built from Pleiades tiles and buildings detected on these tiles using our algorithms, as well as change detection layers; <br> - The creation of the <strong>Python package for processing satellite images</strong> <a href="https://pypi.org/project/astrovision/"><strong>Astrovision</strong></a></td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>- <a href="https://github.com/InseeFrLab/astrovision" class="uri">https://github.com/InseeFrLab/astrovision</a> : <strong>Python package</strong> for working with satellite data <br> - <a href="https://inseefrlab.github.io/satellite-images-webapp/" class="uri">https://inseefrlab.github.io/satellite-images-webapp/</a> code of the <strong>viewing application</strong> <br> - <a href="https://github.com/InseeFrLab/satellite-images-preprocess">https://github.com/InseeFrLab/satellite-images-preprocess</a> code for <strong>data pre-processing</strong> <br> - <a href="https://github.com/InseeFrLab/satellite-images-train" class="uri">https://github.com/InseeFrLab/satellite-images-train</a> code for <strong>model training</strong> <br> - <a href="https://github.com/InseeFrLab/satellite-images-inference" class="uri">https://github.com/InseeFrLab/satellite-images-inference</a> code for <strong>inference</strong></td>
</tr>
</tbody>
</table>
</section>
<section id="documents-relating-to-the-insee-satellite-data-project" class="level1">
<h1>Documents relating to the Insee satellite data project</h1>
<div id="listing-presentations" class="quarto-listing quarto-listing-container-custom">

<link href="../../assets/css/all.css" rel="stylesheet">

<div class="list grid quarto-listing-cols-3">

</div>

<div class="listing-no-matching d-none">No matching items</div>
</div>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNwYWNrYWdlJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNSJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1773705600000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="217">
<a href="../../project/2026_sndsTools/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
sndsTools, an R package for extracting healthcare utilization in SNDS health data
</h5>
<div class="card-text listing-description delink">
<p>The R package <code>sndsTools</code> facilitates the extraction of healthcare utilization from the Système National de Données de Santé (SNDS) health data hosted on the National Health…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
17 Mar 2026
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="cGFja2FnZSUyQ1IlMkNpbiUyMHByb2R1Y3Rpb24lMkNJbnNlZQ==" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419540" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="200">
<a href="../../project/2023_doremifasol/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_doremifasol/doremifasol.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Doremifasol
</h5>
<div class="card-text listing-description delink">
<p>The package <i class="fa-brands fa-r-project" aria-label="r-project"></i> <code>Doremifasol</code> makes it easier for data scientists to retrieve Insee data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNwYWNrYWdlJTJDUHl0aG9u" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="151">
<a href="../../project/2023_pynsee/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_pynsee/example_pynsee.webp" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
pynsee, a <i class="fa-brands fa-python" aria-label="python"></i> Python package for retrieving INSEE data
</h5>
<div class="card-text listing-description delink">
<p>The <i class="fa-brands fa-python" aria-label="python"></i> package <code>pynsee</code> makes it easier for data scientists to retrieve INSEE data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>Python</category>
  <category>package</category>
  <category>deep learning</category>
  <category>satellite imagery</category>
  <guid>https://ssphub-test.netlify.app/project/2022_satellites/</guid>
  <pubDate>Sat, 01 Oct 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_satellites/Satellites_Mayotte.png" medium="image" type="image/png" height="187" width="144"/>
</item>
<item>
  <title>GDP Tracker: a tool for continuous economic forecasting</title>
  <link>https://ssphub-test.netlify.app/project/2019_gdp_tracker/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Nowcasting GDP with GDP Tracker</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Details of the project</strong></td>
<td>The GDP Tracker project consists of building a continuous economic forecasting tool, fed by all the recent economic indicators available at any given time <em>t</em>. The aim of this project is to make the best possible use of the most recent sources to estimate the dynamics of the major macroeconomic aggregates over the last few weeks. These sources are diverse in nature: business and consumer surveys, quarterly national accounts, hard statistical data, as well as data from less traditional sources (internet searches, etc.). <br> <br> Using this heterogeneous set of data, the tool produces a growth forecast for the macroeconomic aggregates that are most closely scrutinised for French economic conditions (GDP, household consumption, investment, etc.). This forecast is made for the current quarter or the following quarters, based on <em>machine learning</em> models (LASSO, random forests, etc.). <br> <br> The tool was initially developed for a <a href="https://www.insee.fr/en/statistiques/fichier/4269288/122019_dossier1E.pdf">business conditions report in 2019</a>. Following the health crisis, it was taken up again for more systematic use. Initially focused on France and only on GDP for the current quarter, it has been extended to other countries (Germany, Italy, Spain) and is currently being extended to aggregates other than GDP and to more distant forecasting horizons.</td>
</tr>
<tr class="even">
<td><strong>Stakeholders</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.insee.fr/en/statistiques/4269288?sommaire=4269398">Continuous forecasting of French growth</a>, Conjoncture in France - December 2019</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="c3R1ZHklMkNub3djYXN0aW5nJTJDZm9yZWNhc3RzJTJDSW5zZWU=" data-listing-date-sort="1756684800000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="297">
<a href="../../project/2025_nowcasting/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_nowcasting/comp_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Comparison of forecasts between <em>nowcasting</em> and bottom-up approach
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('c3R1ZHk='); return false;">study</div>

<div class="listing-category" onclick="window.quartoListingCategory('bm93Y2FzdGluZw=='); return false;">nowcasting</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

</div>
<div class="card-text listing-description delink">
<p>Use of real-time forecasting models (<em>nowcasting</em>) inspired by the Atlanta Federal Reserve’s “GDPnow” to forecast GDP growth and comparison with the bottom-up approach</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Sept 2025
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNiYW5rJTIwYWNjb3VudCUyMGRhdGElMkNleHBlcmltZW50" data-listing-date-sort="1748736000000" data-listing-file-modified-sort="1778082419544" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="579">
<a href="../../project/2025_comptes_bancaires_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2025_comptes_bancaires_conj/money.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Use of banking data for INSEE economic forecasts
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('YmFuayUyMGFjY291bnQlMjBkYXRh'); return false;">bank account data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2025
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

</div>
<div class="card-text listing-description delink">
<p>Use a continuous flow of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="Zm9yZWNhc3RzJTJDZXhwZXJpbWVudCUyQ0luc2VlJTJDY3JlZGl0JTIwY2FyZCUyMGRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419514" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="561">
<a href="../../project/2020_cb_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_cb_conj/cb_conj.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using credit card data and mobile phone data to forecast economic activity
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>The 2020 health crisis required a review of forecasting processes to be more responsive to events. INSEE used credit card transaction data to forecast economic activity.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNwcml2YXRlJTIwZGF0YSUyQ2V4cGVyaW1lbnQ=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="302">
<a href="../../project/2020_electricite_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_electricite_conj/electricity.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
What do electricity production and consumption data say about economic activity during the lockdown period?
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('cHJpdmF0ZSUyMGRhdGE='); return false;">private data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
<p>Using electricity production and consumption data to forecast economic activity</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Regarding nowcasting, an experiment was also conducted in 2022 to impute unpublished variables and forecast growth; see the presentation <a href="https://journees-methodologie-statistique.insee.net/nowcasting-pib-imputation-de-variables-non-encore-publiees/">GDP nowcasting: imputation of variables not yet published</a> at the 2022 Statistical Methodology Days (journées de méthodologie statistique).↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>machine learning</category>
  <category>nowcasting</category>
  <category>experiment stopped</category>
  <category>forecasts</category>
  <category>Insee</category>
  <guid>https://ssphub-test.netlify.app/project/2019_gdp_tracker/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2019_gdp_tracker/evol_growth_en.png" medium="image" type="image/png" height="59" width="144"/>
</item>
<item>
  <title>Curiexplore, the platform for comparing national education and research policies</title>
  <link>https://ssphub-test.netlify.app/project/2022_Curiexplore/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Curiexplore, the platform for comparing national education and research policies</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The <code>CurieXplore</code> project, developed by <code>SIES</code> (the statistical department of the French Ministry of Higher Education and Research), offers an interactive visualisation of the teaching and research environment in different countries. <br> The platform combines textual content collected by SIES from the French diplomatic network with data regularly web-scraped from international statistical sources (UNESCO, World Bank, OECD, Eurostat, etc.) or private sources (rankings of higher education institutions such as Shanghai, U-Multirank or THE).</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Statistical Service of the Ministry of Higher Education and Research (<a href="https://www.enseignementsup-recherche.gouv.fr/fr/sies">SIES</a>)</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The project is now in production, with a dedicated website available at <a href="https://curiexplore.enseignementsup-recherche.gouv.fr/">curiexplore.enseignementsup-recherche.gouv.fr (in French)</a>.</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>- The code is available on GitHub <i class="fa-brands fa-github" aria-label="github"></i> <a href="https://github.com/dataesr/curiexplore-ui">https://github.com/dataesr/curiexplore-ui</a>.</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Jocas, webscraping online job offers
</h5>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="d2Vic2NyYXBpbmclMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ29wZW4tZGF0YSUyQ2luJTIwcHJvZHVjdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419525" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="304">
<a href="../../project/2022_bso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Open Science Monitor
</h5>
<div class="card-text listing-description delink">
<p>To be able to monitor the opening up of scientific publications (the objective of the <strong>national plan for open science</strong>), the statistical service of the Ministry of Higher…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDd2Vic2NyYXBpbmclMkNDUEklMkNJbnNlZQ==" data-listing-date-sort="1622505600000" data-listing-file-modified-sort="1778082419522" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="244">
<a href="../../project/2021_webscraping_ipc_hotel/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform
</h5>
<div class="card-text listing-description delink">
<p>Exploring webscraping tools to collect hotel overnight stay prices for the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="card-text listing-description delink">
<p>Use a continuous flow of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="SW5zZWUlMkNDUEklMkN3ZWJzY3JhcGluZyUyQ3JhbmRvbSUyMGZvcmVzdA==" data-listing-date-sort="1590969600000" data-listing-file-modified-sort="1778082419517" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="383">
<a href="../../project/2020_webscraping_ipc/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_webscraping_ipc/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Webscrape product characteristics to improve inflation measurement
</h5>
<div class="card-text listing-description delink">
<p>Collect product characteristics on the web to improve the way quality effects are taken into account in the consumer price index.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2020
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>webscraping</category>
  <category>in production</category>
  <category>datavisualisation</category>
  <category>SIES</category>
  <guid>https://ssphub-test.netlify.app/project/2022_Curiexplore/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" medium="image" type="image/png" height="114" width="144"/>
</item>
<item>
  <title>Methodological work on the Family Budget survey</title>
  <link>https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Methodological work on the Family Budget survey</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Details of the project</strong></td>
<td>Data collection for the next Family Budget survey in 2026 will be very different from that of the previous survey (2017). <br> - Some of the household consumption data will be collected through a <strong>mobile application</strong> made available to the households surveyed. <br> - In addition, the products derived from the consumption data will no longer be recorded in the database by the interviewers, but by a data entry or automatic extraction service provider. <br> There are two main strands to this experiment: <br> - The first involves testing open-source solutions for the <strong>automatic extraction of the contents of till receipts</strong> in order to be able to work more effectively with partners such as Teklia and potentially carry out some of the extraction work in-house; <br> - The second (main) strand concerns the <strong>automatic coding of consumer products in the COICOP nomenclature</strong>: training of a classification model on data from the previous survey and development of an optimal recovery strategy.</td>
</tr>
<tr class="even">
<td><strong>Stakeholders</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>Project in progress</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>- <a href="https://git.lab.sspcloud.fr/ssplab/bdf" class="uri">https://git.lab.sspcloud.fr/ssplab/bdf</a> <br> - <a href="https://git.lab.sspcloud.fr/ssplab/experimentation-bdf" class="uri">https://git.lab.sspcloud.fr/ssplab/experimentation-bdf</a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Jocas, webscraping online job offers
</h5>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="UHJvamVjdHMlMkNQeXRob24lMkNhdXRvbWF0aWMlMjBjb2Rpbmc=" data-listing-date-sort="1609459200000" data-listing-file-modified-sort="1778082419519" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="353">
<a href="../../project/2021_codif_PCS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_codif_PCS/2021_codif_PCS.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of occupations in the PCS 2020 nomenclature
</h5>
<div class="card-text listing-description delink">
<p>Automatically code occupations as part of the switch to the new PCS nomenclature (PCS 2020)</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data in the COICOP nomenclature to calculate the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNtYWNoaW5lJTIwbGVhcm5pbmc=" data-listing-date-sort="1559347200000" data-listing-file-modified-sort="1778082419512" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="286">
<a href="../../project/2019_classification_asso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2019_classification_asso/enquete_asso.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of association activity
</h5>
<div class="card-text listing-description delink">
<p>Automatic coding of association activity using machine learning methods</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2019
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>automatic coding</category>
  <category>data extraction</category>
  <category>scanner data</category>
  <guid>https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>Jocas, webscraping online job offers</title>
  <link>https://ssphub-test.netlify.app/project/2022_JOCAS/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Online job vacancies, a new source of labour market data</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>In just a few years, the Internet has become a new source of information on the job market. According to the Dares Job Offer and Recruitment survey (Ofer), 95% of job advertisements were published on the Internet in 2016, compared with 53% in 2005. With this in mind, Dares decided to collect online job offers published on around fifteen websites to create a new database of job offers: Jocas (Job offers collection and analysis system). Various tools are used to build this new database: webscraping, automatic text classification algorithms and de-duplication. <br> For 2019, the Jocas database can be compared with the usual sources of public statistics on job vacancies, whether administrative sources, such as vacancies advertised by Pôle emploi and Déclarations préalables à l’embauche (DPAE) from Urssaf, or data from surveys such as Pôle emploi’s “Besoins en main-d’œuvre” (BMO), INSEE’s “Emploi” survey, and DARES’ “Activité et conditions d’emploi de la main-d’œuvre” (Acemo) survey. The results show that Jocas covers occupations unevenly. Occupations with a high proportion of managerial staff or that recruit a lot of people online tend to be over-represented. Conversely, those with a high proportion of multiple recruitments or using informal recruitment channels tend to be under-represented.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td><a href="https://dares.travail-emploi.gouv.fr/">DARES</a></td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>Online vacancy data have been used to measure tensions on the labour market. They were also used to produce the dashboard monitoring the labour market situation in 2020-2021 during the Covid-19 crisis. Jocas data are freely accessible to students, researchers and civil servants. Access to the data may also be granted for statistical and non-commercial use, on request from DARES. The database is accessible on <a href="https://datalab.sspcloud.fr/">INSEE’s SSPCloud platform</a> by following the path ‘projet-jocas-prod/diffusion/JOCAS’.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://dares.travail-emploi.gouv.fr/publication/les-offres-demploi-en-ligne-nouvelle-source-de-donnees-sur-le-marche-du-travail">Description on the DARES website</a> <br> - <a href="https://dares.travail-emploi.gouv.fr/sites/default/files/ed7db1ecec02433585a7654411102726/DE%20258%20Offres%20demploi%20en%20ligne.pdf">Working document</a> <br> - <a href="https://dares.travail-emploi.gouv.fr/actualite/un-hackathon-pour-dedoubler-les-offres-demploi-en-ligne">Hackathon in March 2023</a> on the de-duplication of job offers <br> - <a href="https://dares.travail-emploi.gouv.fr/actualite/suivre-les-offres-demploi-en-ligne-pour-mieux-comprendre-le-marche-du-travail">Project news</a> <br> - <a href="https://dares.travail-emploi.gouv.fr/enquete-source/job-offers-collection-and-analysis-system">Training</a> to use the JOCAS database</td>
</tr>
<tr class="odd">
<td><strong>Project code</strong></td>
<td>- The code repository is available on GitHub <i class="fa-brands fa-github" aria-label="github"></i> <a href="https://github.com/OnlineJobVacanciesESSnetBigData/JobTitleProcessing_FR">https://github.com/OnlineJobVacanciesESSnetBigData/JobTitleProcessing_FR</a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="174">
<a href="../../project/2022_Curiexplore/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Curiexplore, the platform for comparing national education and research policies
</h5>
<div class="card-text listing-description delink">
<p>Interactive visualisation of the teaching environment and research environment in different countries.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="YXV0b21hdGljJTIwY29kaW5nJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNzY2FubmVyJTIwZGF0YQ==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="328">
<a href="../../project/2022_Enquete_Budget_Famille/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Methodological work on the Family Budget survey
</h5>
<div class="card-text listing-description delink">
<p>Modernisation of the family budget survey using automatic classification tools</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="d2Vic2NyYXBpbmclMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ29wZW4tZGF0YSUyQ2luJTIwcHJvZHVjdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419525" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="304">
<a href="../../project/2022_bso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Open Science Monitor
</h5>
<div class="card-text listing-description delink">
<p>To be able to monitor the opening up of scientific publications (the objective of the <strong>national plan for open science</strong>), the statistical service of the Ministry of Higher…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDcGFja2FnZSUyQ2luJTIwcHJvZHVjdGlvbiUyQ01MRmxvdw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419531" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="287">
<a href="../../project/2022_codif_ape/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of companies’ main activity
</h5>
<div class="card-text listing-description delink">
<p>Develop a machine learning algorithm to automate the classification of companies’ main activities and put it into production</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDd2Vic2NyYXBpbmclMkNDUEklMkNJbnNlZQ==" data-listing-date-sort="1622505600000" data-listing-file-modified-sort="1778082419522" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="244">
<a href="../../project/2021_webscraping_ipc_hotel/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform
</h5>
<div class="card-text listing-description delink">
<p>Exploring webscraping tools to collect hotel overnight stay prices for the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="card-text listing-description delink">
<p>Use a continuous stream of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="6" data-categories="UHJvamVjdHMlMkNQeXRob24lMkNhdXRvbWF0aWMlMjBjb2Rpbmc=" data-listing-date-sort="1609459200000" data-listing-file-modified-sort="1778082419519" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="353">
<a href="../../project/2021_codif_PCS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2021_codif_PCS/2021_codif_PCS.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of occupations in the PCS 2020 nomenclature
</h5>
<div class="card-text listing-description delink">
<p>Automatically code occupations as part of the switch to the new PCS nomenclature (PCS 2020)</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="7" data-categories="SW5zZWUlMkNDUEklMkN3ZWJzY3JhcGluZyUyQ3JhbmRvbSUyMGZvcmVzdA==" data-listing-date-sort="1590969600000" data-listing-file-modified-sort="1778082419517" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="383">
<a href="../../project/2020_webscraping_ipc/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_webscraping_ipc/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Webscrape product characteristics to improve inflation measurement
</h5>
<div class="card-text listing-description delink">
<p>Collect product characteristics on the web to improve the way quality effects are taken into account in the consumer price index.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="8" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data in the COICOP nomenclature to calculate the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="9" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNtYWNoaW5lJTIwbGVhcm5pbmc=" data-listing-date-sort="1559347200000" data-listing-file-modified-sort="1778082419512" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="286">
<a href="../../project/2019_classification_asso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2019_classification_asso/enquete_asso.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of association activity
</h5>
<div class="card-text listing-description delink">
<p>Automatic coding of association activity using machine learning methods</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2019
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>

</div>



</section>

 ]]></description>
  <category>webscraping</category>
  <category>in production</category>
  <category>automatic coding</category>
  <category>DARES</category>
  <guid>https://ssphub-test.netlify.app/project/2022_JOCAS/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" medium="image" type="image/png" height="32" width="144"/>
</item>
<item>
  <title>Open Science Monitor</title>
  <link>https://ssphub-test.netlify.app/project/2022_bso/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>French Open Science Monitor</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The generalisation of open access to scientific publications is one of the axes of the French national open science strategy, with the objective of a 100% open access rate in 2030. It facilitates, broadens and accelerates the dissemination of research results to scientific communities and to society in general: teachers, students, companies, associations, public policy actors, etc. <br> <br> The French Open Science Monitor therefore builds open science indicators for publications, data, software and code in order to measure progress towards this objective. <br> For publications, as much metadata as possible is collected. It is then filtered to keep only French affiliations and enriched to track the open status of publications by identified discipline. The data is then published as open data. This method makes it possible to create the most exhaustive database of French publications in the world. <br> For data, software and code, scientific output is first listed. The text of each publication is then analysed to detect the availability of open access data, software or code. Finally, monitoring indicators are produced and published.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Statistical Service of the Ministry of Higher Education and Research (<a href="https://www.enseignementsup-recherche.gouv.fr/fr/sies">SIES</a>)</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The project is presented on the dedicated website: <a href="https://frenchopensciencemonitor.esr.gouv.fr/" class="uri">https://frenchopensciencemonitor.esr.gouv.fr/</a>. It includes a <a href="https://frenchopensciencemonitor.esr.gouv.fr/about/methodology">methodology</a>.</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>The code files are available on <a href="https://github.com/search?q=org%3Adataesr%20bso&amp;type=repositories">GitHub <i class="fa-brands fa-github" aria-label="github"></i></a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects-related-to-webscraping" class="level1">
<h1>Similar projects related to webscraping</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="174">
<a href="../../project/2022_Curiexplore/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects-related-to-webscraping">
Curiexplore, the platform for comparing national education and research policies
</h5>
<div class="card-text listing-description delink">
<p>Interactive visualisation of the teaching environment and research environment in different countries.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Jocas, webscraping online job offers
</h5>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables the DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDd2Vic2NyYXBpbmclMkNDUEklMkNJbnNlZQ==" data-listing-date-sort="1622505600000" data-listing-file-modified-sort="1778082419522" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="244">
<a href="../../project/2021_webscraping_ipc_hotel/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform
</h5>
<div class="card-text listing-description delink">
<p>Exploring webscraping tools to collect hotel overnight stay prices for the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="card-text listing-description delink">
<p>Use a continuous stream of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="SW5zZWUlMkNDUEklMkN3ZWJzY3JhcGluZyUyQ3JhbmRvbSUyMGZvcmVzdA==" data-listing-date-sort="1590969600000" data-listing-file-modified-sort="1778082419517" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="383">
<a href="../../project/2020_webscraping_ipc/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_webscraping_ipc/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Webscrape product characteristics to improve inflation measurement
</h5>
<div class="card-text listing-description delink">
<p>Collect product characteristics on the web to improve the way quality effects are taken into account in the consumer price index.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2020
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>webscraping</category>
  <category>datavisualisation</category>
  <category>open-data</category>
  <category>in production</category>
  <category>SIES</category>
  <guid>https://ssphub-test.netlify.app/project/2022_bso/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" medium="image" type="image/png" height="44" width="144"/>
</item>
<item>
  <title>Automatic coding of companies’ main activity</title>
  <link>https://ssphub-test.netlify.app/project/2022_codif_ape/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Automatic coding of companies’ main activity</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The coding of companies’ main activity (APE) on the basis of free-text activity descriptions in the Sirene register was previously carried out using 6 deterministic coding environments relying on a very large number of decision rules. The aim of the experiment is to <strong>test the performance of statistical learning models in predicting the APE category</strong> as part of the overhaul of the Sirene register and the introduction of a one-stop shop. <br></td>
</tr>
<tr class="even">
<td><strong>Stakeholders</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The resulting model achieves <strong>similar performance to previous models</strong> while automating the coding, and also offers decision support. The model has also been put into production, applying MLOps principles where possible. <br> Presentations and written materials relating to the project can be accessed at <a href="https://inseefrlab.github.io/codif-ape-prez/">this site</a>.</td>
</tr>
<tr class="even">
<td><strong>Project code</strong></td>
<td>Code repositories are available on <a href="https://github.com/search?q=org%3AInseeFrLab%20codif-ape&amp;type=repositories&amp;p=1">GitHub <i class="fa-brands fa-github" aria-label="github"></i></a>. They include: <br> - Code for annotating data using Label Studio; <br> - Code for a coding web API deployed on the SSP Cloud; <br> - Code implementing a visualisation dashboard for monitoring the activity of a coding model in production and accessible via a web API; <br> - Code for training APE classification models. <br></td>
</tr>
</tbody>
</table>
</section>
<section id="project-documents" class="level1">
<h1>Project documents</h1>
<div id="listing-presentations" class="quarto-listing quarto-listing-container-custom">

<link href="../../assets/css/all.css" rel="stylesheet">

<div class="list grid quarto-listing-cols-3">

</div>

<div class="listing-no-matching d-none">No matching items</div>
</div>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNwYWNrYWdlJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNSJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1773705600000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="217">
<a href="../../project/2026_sndsTools/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
sndsTools, an R package for extracting healthcare utilization in SNDS health data
</h5>
<div class="card-text listing-description delink">
<p>The R package <code>sndsTools</code> facilitates the extraction of healthcare utilization from the Système National de Données de Santé (SNDS) health data hosted on the National Health…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
17 Mar 2026
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="cGFja2FnZSUyQ1IlMkNpbiUyMHByb2R1Y3Rpb24lMkNJbnNlZQ==" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419540" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="200">
<a href="../../project/2023_doremifasol/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_doremifasol/doremifasol.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Doremifasol
</h5>
<div class="card-text listing-description delink">
<p>The package <i class="fa-brands fa-r-project" aria-label="r-project"></i> <code>Doremifasol</code> makes it easier for data scientists to retrieve Insee data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNwYWNrYWdlJTJDUHl0aG9u" data-listing-date-sort="1672531200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="151">
<a href="../../project/2023_pynsee/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2023_pynsee/example_pynsee.webp" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
pynsee, a <i class="fa-brands fa-python" aria-label="python"></i> Python package for retrieving INSEE data
</h5>
<div class="card-text listing-description delink">
<p>The <i class="fa-brands fa-python" aria-label="python"></i> <code>pynsee</code> package makes it easier for data scientists to retrieve INSEE data. The library is open source.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2023
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="UHl0aG9uJTJDcGFja2FnZSUyQ2RlZXAlMjBsZWFybmluZyUyQ3NhdGVsbGl0ZSUyMGltYWdlcnk=" data-listing-date-sort="1664582400000" data-listing-file-modified-sort="1778082419539" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="454">
<a href="../../project/2022_satellites/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_satellites/Satellites_Mayotte.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using satellite images for official statistics
</h5>
<div class="card-text listing-description delink">
<p>Using satellite images to improve population censuses in the French overseas territories</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Oct 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="YXV0b21hdGljJTIwY29kaW5nJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNzY2FubmVyJTIwZGF0YQ==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="328">
<a href="../../project/2022_Enquete_Budget_Famille/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Methodological work on the Family Budget survey
</h5>
<div class="card-text listing-description delink">
<p>Modernisation of the family budget survey using automatic classification tools</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Jocas, webscraping online job offers
</h5>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables the DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="6" data-categories="UHJvamVjdHMlMkNQeXRob24lMkNhdXRvbWF0aWMlMjBjb2Rpbmc=" data-listing-date-sort="1609459200000" data-listing-file-modified-sort="1778082419519" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="353">
<a href="../../project/2021_codif_PCS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2021_codif_PCS/2021_codif_PCS.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of occupations in the PCS 2020 nomenclature
</h5>
<div class="card-text listing-description delink">
<p>Automatically code occupations as part of the switch to the new PCS nomenclature (PCS 2020)</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="7" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data in the COICOP nomenclature to calculate the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="8" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDSW5zZWUlMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNtYWNoaW5lJTIwbGVhcm5pbmc=" data-listing-date-sort="1559347200000" data-listing-file-modified-sort="1778082419512" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="286">
<a href="../../project/2019_classification_asso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2019_classification_asso/enquete_asso.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Automatic coding of association activity
</h5>
<div class="card-text listing-description delink">
<p>Automatic coding of association activity using machine learning methods</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2019
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>

</div>



</section>

 ]]></description>
  <category>Python</category>
  <category>automatic coding</category>
  <category>package</category>
  <category>in production</category>
  <category>MLFlow</category>
  <guid>https://ssphub-test.netlify.app/project/2022_codif_ape/</guid>
  <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2022_codif_ape/codif_ape_overall.png" medium="image" type="image/png" height="82" width="144"/>
</item>
<item>
  <title>Detecting cybercrime in procedures</title>
  <link>https://ssphub-test.netlify.app/project/2021_detection_cyber/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">

<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Detecting cybercrime in procedures</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>Cybercrime is a new and fast-growing form of crime, and fighting it is a major challenge for the police and gendarmerie. It covers all criminal offences attempted or committed against, or mainly by means of, information and communication systems. It is, however, particularly difficult to quantify: the nomenclature of offences identifies only part of this type of crime. A better statistical measure requires the information recorded in the free-text fields of procedures. The aim of the experiment is to explore the potential of text analysis methods on these data to measure the extent of this type of crime.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>INSEE and the Ministerial Statistical Office (MSO) for Internal Security (<a href="https://www.police-nationale.interieur.gouv.fr/nous-decouvrir/notre-organisation/organisation/service-statistique-ministeriel-de-securite">SSMSI</a>)</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The work was taken up by the SSMSI to derive a two-dimensional nomenclature of cyber-related offences. The algorithm predicts more than 95% of offences as falling within the definition of cybercrime. Similarly, it predicts less than 1% of offences as not falling within the scope of cybercrime as defined.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://journees-methodologie-statistique.insee.net/detection-des-infractions-relevant-de-la-cyberdelinquance-a-partir-dune-analyse-textuelle-des-manieres-doperer/">Detection of cybercrime offences based on textual analysis of modus operandi</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022)</td>
</tr>
</tbody>
</table>


</section>

 ]]></description>
  <category>in production</category>
  <category>SSMSI</category>
  <category>deep learning</category>
  <guid>https://ssphub-test.netlify.app/project/2021_detection_cyber/</guid>
  <pubDate>Tue, 01 Jun 2021 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2021_detection_cyber/cybercriminalite.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Modelling road vehicle fleet ownership and use</title>
  <link>https://ssphub-test.netlify.app/project/2021_vehicules/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Modelling road vehicle fleet ownership and use</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>In 2019, the Ministerial Statistical Office of the French Ministry of Ecological Transition (SDES) began overhauling the statistical register of road vehicles (RSVero). The main innovation of this project is the use of technical inspection data to check that registered vehicles are still in circulation and to determine their annual use from the odometer reading taken at each inspection. However, integrating these data raises methodological issues: inspections take place on variable dates, sometimes late, and the status of a vehicle is uncertain when its last inspection precedes the date for which the fleet is to be determined. The project makes it possible to estimate, for every car in the inventory, the probability that it is still in circulation on a given date and, where appropriate, the annual distance travelled. More broadly, it is part of the drive to develop methodologies for processing administrative sources.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee, <a href="https://www.statistiques.developpement-durable.gouv.fr/english-contents">MSO of the French Ministry of Ecological Transition (SDES)</a></td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>The methodology is now in production and is used to determine the vehicles on the road each year, their characteristics and their use.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.statistiques.developpement-durable.gouv.fr/media/7230/download?inline">Methodology for estimating vehicle fleets and distances travelled (abstract in English)</a> Working document, March 2024 <br> - <a href="https://journees-methodologie-statistique.insee.net/modelisation-de-lappartenance-au-parc-des-vehicules-routiers-et-de-son-utilisation/">Modelling road vehicle fleet ownership and use</a> Journées de méthodologie statistique 2022</td>
</tr>
<tr class="odd">
<td><strong>Project code</strong></td>
<td>- Repo <a href="">GitHub <i class="fa-brands fa-github" aria-label="github"></i></a></td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="SW5zZWUlMkNwYWNrYWdlJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNSJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1773705600000" data-listing-file-modified-sort="1778082419546" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="217">
<a href="../../project/2026_sndsTools/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2026_sndsTools/sndsTools_img.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
sndsTools, an R package for extracting healthcare utilization in SNDS health data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('cGFja2FnZQ=='); return false;">package</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YSUyMGV4dHJhY3Rpb24='); return false;">data extraction</div>

<div class="listing-category" onclick="window.quartoListingCategory('Ug=='); return false;">R</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

</div>
<div class="card-text listing-description delink">
<p>The R package <code>sndsTools</code> facilitates the extraction of healthcare utilization from the Système National de Données de Santé (SNDS) health data hosted on the National Health…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
17 Mar 2026
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="bWF0Y2hpbmclMkNhZG1pbmlzdHJhdGl2ZSUyMGRhdGElMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1609459200000" data-listing-file-modified-sort="1778082419518" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="273">
<a href="../../project/2021_Appariement/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_Appariement/Resil.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Comparison of matching methods and the contribution of machine learning
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('bWF0Y2hpbmc='); return false;">matching</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

</div>
<div class="card-text listing-description delink">
<p>To test and compare different matching methods in order to draw up recommendations for the work needed to build directories, particularly as part of the RESIL multiannual…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="YWRtaW5pc3RyYXRpdmUlMjBkYXRhJTJDSW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhJTIwZWRpdGluZw==" data-listing-date-sort="1514764800000" data-listing-file-modified-sort="1778082419511" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="251">
<a href="../../project/2018_outlier_dsn/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2018_outlier_dsn/dsn.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Detecting and processing outliers or missing values, application to the Déclaration Sociale Nominative (Social Nominative Declarations)
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YSUyMGVkaXRpbmc='); return false;">data editing</div>

</div>
<div class="card-text listing-description delink">
<p>Use of machine learning methods to detect and process outliers or missing values, application to the Social Nominative Declarations (Déclaration Sociale Nominative)</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2018
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="SW5zZWUlMkNleHBlcmltZW50JTJDbW9iaWxlJTIwcGhvbmUlMjBkYXRhJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1514764800000" data-listing-file-modified-sort="1778082419511" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="208">
<a href="../../project/2018_segregation/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2018_segregation/indice_segregation.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Urban segregation: insights from mobile phone data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

</div>
<div class="card-text listing-description delink">
<p>Merging administrative data and MNO data to estimate urban segregation at a local level</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2018
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>in production</category>
  <category>matching</category>
  <category>SDES</category>
  <category>administrative data</category>
  <guid>https://ssphub-test.netlify.app/project/2021_vehicules/</guid>
  <pubDate>Tue, 01 Jun 2021 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/assets/media/logo_ssm_insee.webp" medium="image" type="image/webp"/>
</item>
<item>
  <title>Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform</title>
  <link>https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th><em>Webscraping</em> of the prices of hotel nights to construct the consumer price index</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The consumer price index for overnight hotel stays is calculated from data collected in the field by INSEE surveyors, who record the price of a room for two people for the night of the same day, breakfast included. To improve the index and overcome some limitations of the current method, this project explores an innovative collection method: <em>webscraping</em> a booking site. <br> Data collected online are raw and need cleaning: for example, the same characteristic is not necessarily described in the same way in two observations. To overcome the problems associated with a fixed-basket index, the final index is built from homogeneous classes. Finally, the index calculated from the online booking platform data is compared with the published index.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>Since 2026, the hotel price index has been based half on field data and half on data from the booking platform.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://journees-methodologie-statistique.insee.net/indices-des-prix-a-la-consommation-des-nuitees-hotelieres-lexperience-du-webscraping-dune-plateforme-de-reservation-en-ligne/">Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022)</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="ZXhwZXJpbWVudCUyQ21vYmlsZSUyMHBob25lJTIwZGF0YSUyQ2NyZWRpdCUyMGNhcmQlMjBkYXRhJTJDSW5zZWU=" data-listing-date-sort="1704067200000" data-listing-file-modified-sort="1778082419541" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="275">
<a href="../../project/2024_cb_mno_tabac/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2024_cb_mno_tabac/tabac.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
An assessment of cross-border tobacco purchases and associated tax losses in France
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

</div>
<div class="card-text listing-description delink">
<p>Using the closing of borders in 2020 as a natural experiment to measure cross-border tobacco purchases</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2024
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="174">
<a href="../../project/2022_Curiexplore/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Curiexplore, the platform for comparing national education and research policies
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('U0lFUw=='); return false;">SIES</div>

</div>
<div class="card-text listing-description delink">
<p>Interactive visualisation of the teaching environment and research environment in different countries.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="YXV0b21hdGljJTIwY29kaW5nJTJDZGF0YSUyMGV4dHJhY3Rpb24lMkNzY2FubmVyJTIwZGF0YQ==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="328">
<a href="../../project/2022_Enquete_Budget_Famille/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Enquete_Budget_Famille/visuel_Budget_des_familles_1.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Methodological work on the Family Budget survey
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YSUyMGV4dHJhY3Rpb24='); return false;">data extraction</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

</div>
<div class="card-text listing-description delink">
<p>Modernisation of the family budget survey using automatic classification tools</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Jocas, webscraping online job offers
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('REFSRVM='); return false;">DARES</div>

</div>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="d2Vic2NyYXBpbmclMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ29wZW4tZGF0YSUyQ2luJTIwcHJvZHVjdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419525" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="304">
<a href="../../project/2022_bso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Open Science Monitor
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('b3Blbi1kYXRh'); return false;">open-data</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

<div class="listing-category" onclick="window.quartoListingCategory('U0lFUw=='); return false;">SIES</div>

</div>
<div class="card-text listing-description delink">
<p>To be able to monitor the opening up of scientific publications (the objective of the <strong>national plan for open science</strong>), the statistical service of the Ministry of Higher…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="5" data-categories="SW5zZWUlMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNleHBlcmltZW50JTJDZm9yZWNhc3RzJTJDd2Vic2NyYXBpbmc=" data-listing-date-sort="1614556800000" data-listing-file-modified-sort="1778082419521" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="267">
<a href="../../project/2021_gdp_media/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Predicting growth by reading the newspaper
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

</div>
<div class="card-text listing-description delink">
<p>Use a continuous flow of press articles to build an indicator that helps forecast growth</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Mar 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="6" data-categories="Zm9yZWNhc3RzJTJDZXhwZXJpbWVudCUyQ0luc2VlJTJDY3JlZGl0JTIwY2FyZCUyMGRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419514" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="561">
<a href="../../project/2020_cb_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_cb_conj/cb_conj.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Using credit card data and mobile phone data to forecast economic activity
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Y3JlZGl0JTIwY2FyZCUyMGRhdGE='); return false;">credit card data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>The 2020 health crisis required a review of forecasting processes to be more responsive to events. INSEE used credit card transaction data to forecast economic activity.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="7" data-categories="SW5zZWUlMkNmb3JlY2FzdHMlMkNwcml2YXRlJTIwZGF0YSUyQ2V4cGVyaW1lbnQ=" data-listing-date-sort="1606780800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="302">
<a href="../../project/2020_electricite_conj/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_electricite_conj/electricity.jpg" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
What do the electricity production and consumption data say about economic activity during the containment period?
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Zm9yZWNhc3Rz'); return false;">forecasts</div>

<div class="listing-category" onclick="window.quartoListingCategory('cHJpdmF0ZSUyMGRhdGE='); return false;">private data</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

</div>
<div class="card-text listing-description delink">
<p>Using electricity production and consumption data to forecast economic activity</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Dec 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="8" data-categories="ZGF0YXZpc3VhbGlzYXRpb24lMkNtYWNoaW5lJTIwbGVhcm5pbmclMkNJbnNlZSUyQ2V4cGVyaW1lbnQlMkNvcGVuLWRhdGElMkNtb2JpbGUlMjBwaG9uZSUyMGRhdGE=" data-listing-date-sort="1604188800000" data-listing-file-modified-sort="1778082419516" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="386">
<a href="../../project/2020_mvtpop/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_mvtpop/mvtpop.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Population movements around the March 2020 containment using mobile phone network operators data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('ZGF0YXZpc3VhbGlzYXRpb24='); return false;">datavisualisation</div>

<div class="listing-category" onclick="window.quartoListingCategory('bWFjaGluZSUyMGxlYXJuaW5n'); return false;">machine learning</div>

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('b3Blbi1kYXRh'); return false;">open-data</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

</div>
<div class="card-text listing-description delink">
<p>INSEE has had access to mobile telephony data as part of the monitoring of the 2020 health crisis. These data were used to produce the following statistics on population…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Nov 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="9" data-categories="SW5zZWUlMkNDUEklMkN3ZWJzY3JhcGluZyUyQ3JhbmRvbSUyMGZvcmVzdA==" data-listing-date-sort="1590969600000" data-listing-file-modified-sort="1778082419517" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="383">
<a href="../../project/2020_webscraping_ipc/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_webscraping_ipc/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Webscrape product characteristics to improve inflation measurement
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q1BJ'); return false;">CPI</div>

<div class="listing-category" onclick="window.quartoListingCategory('d2Vic2NyYXBpbmc='); return false;">webscraping</div>

<div class="listing-category" onclick="window.quartoListingCategory('cmFuZG9tJTIwZm9yZXN0'); return false;">random forest</div>

</div>
<div class="card-text listing-description delink">
<p>Collect product characteristics on the web to improve the way quality effects are taken into account in the consumer price index.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="10" data-categories="UHl0aG9uJTJDYXV0b21hdGljJTIwY29kaW5nJTJDc2Nhbm5lciUyMGRhdGElMkNDT0lDT1AlMkNDUEklMkNpbiUyMHByb2R1Y3Rpb24=" data-listing-date-sort="1577836800000" data-listing-file-modified-sort="1778082419515" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="378">
<a href="../../project/2020_donnees_caisse/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2020_donnees_caisse/2020_donnees_caisse.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Classification of checkout data using machine learning
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('UHl0aG9u'); return false;">Python</div>

<div class="listing-category" onclick="window.quartoListingCategory('YXV0b21hdGljJTIwY29kaW5n'); return false;">automatic coding</div>

<div class="listing-category" onclick="window.quartoListingCategory('c2Nhbm5lciUyMGRhdGE='); return false;">scanner data</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q09JQ09Q'); return false;">COICOP</div>

<div class="listing-category" onclick="window.quartoListingCategory('Q1BJ'); return false;">CPI</div>

<div class="listing-category" onclick="window.quartoListingCategory('aW4lMjBwcm9kdWN0aW9u'); return false;">in production</div>

</div>
<div class="card-text listing-description delink">
<p>Using machine learning to classify scanner data according to the COICOP nomenclature, as an input to calculating the CPI.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2020
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="11" data-categories="SW5zZWUlMkNleHBlcmltZW50JTJDbW9iaWxlJTIwcGhvbmUlMjBkYXRhJTJDYWRtaW5pc3RyYXRpdmUlMjBkYXRh" data-listing-date-sort="1514764800000" data-listing-file-modified-sort="1778082419511" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="208">
<a href="../../project/2018_segregation/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" data-src="../../project/2018_segregation/indice_segregation.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Urban segregation: insights from mobile phone data
</h5>
<div class="listing-categories">

<div class="listing-category" onclick="window.quartoListingCategory('SW5zZWU='); return false;">Insee</div>

<div class="listing-category" onclick="window.quartoListingCategory('ZXhwZXJpbWVudA=='); return false;">experiment</div>

<div class="listing-category" onclick="window.quartoListingCategory('bW9iaWxlJTIwcGhvbmUlMjBkYXRh'); return false;">mobile phone data</div>

<div class="listing-category" onclick="window.quartoListingCategory('YWRtaW5pc3RyYXRpdmUlMjBkYXRh'); return false;">administrative data</div>

</div>
<div class="card-text listing-description delink">
<p>Merging administrative data and mobile network operator (MNO) data to estimate urban segregation at a local level.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2018
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>

</div>



</section>

 ]]></description>
  <category>in production</category>
  <category>webscraping</category>
  <category>CPI</category>
  <category>Insee</category>
  <guid>https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/</guid>
  <pubDate>Tue, 01 Jun 2021 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/webscraping_ipc.png" medium="image" type="image/png" height="108" width="144"/>
</item>
<item>
  <title>Predicting growth by reading the newspaper</title>
  <link>https://ssphub-test.netlify.app/project/2021_gdp_media/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">
<h1>Project summary</h1>
<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Construction of an economic sentiment indicator based on press articles to forecast economic growth</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The information produced by the media is timely, abundant and covers a wide range of economic fields. In addition, text analysis (text mining) and automated online data collection (webscraping) techniques can be used to develop indicators that summarise this wealth of information. The aim of the experiment is to estimate the added value of a media-based indicator of economic activity as a complement to existing business surveys. It builds on earlier work, which seeks to construct an indicator to help predict GDP from the archives of the newspaper Le Monde, by drawing on additional journalistic sources and reviewing the models used.</td>
</tr>
<tr class="even">
<td><strong>Players</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project results</strong></td>
<td>An indicator of current media sentiment was constructed on the basis of the tone of press articles.</td>
</tr>
<tr class="even">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://www.insee.fr/en/statistiques/2662636?sommaire=2662688">How to forecast employment figures by reading the newspaper</a>, Economic Outlook, March 2017; <br> - <a href="https://www.insee.fr/en/statistiques/3705981?sommaire=3706269">Nowcasting GDP Growth by Reading Newspapers</a>, Economics and Statistics n° 505-506, 2018; <br> - <a href="https://www.insee.fr/en/statistiques/4620524?sommaire=4473307">Information gleaned from press articles can help predict economic activity in real time</a>, Economic Outlook, December 2020; <br> - <a href="https://www.insee.fr/en/statistiques/5351871">French economic activity through press articles</a>, Economic Outlook, March 2021; <br> - <a href="https://journees-methodologie-statistique.insee.net/predire-lactivite-economique-a-partir-darticles-de-presse/">Predicting economic activity from press articles (abstract in English)</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022)</td>
</tr>
</tbody>
</table>
</section>
<section id="similar-projects" class="level1">
<h1>Similar projects</h1>
<div id="listing-similar-project" class="quarto-listing quarto-listing-container-grid">
<div class="list grid quarto-listing-cols-3">
<div class="g-col-1" data-index="0" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419523" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="1" data-listing-word-count-sort="174">
<a href="../../project/2022_Curiexplore/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_Curiexplore/curiexplore.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title" data-anchor-id="similar-projects">
Curiexplore, the platform for comparing national education and research policies
</h5>
<div class="card-text listing-description delink">
<p>Interactive visualisation of the teaching environment and research environment in different countries.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="1" data-categories="d2Vic2NyYXBpbmclMkNpbiUyMHByb2R1Y3Rpb24lMkNhdXRvbWF0aWMlMjBjb2RpbmclMkNEQVJFUw==" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419524" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="3" data-listing-word-count-sort="406">
<a href="../../project/2022_JOCAS/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_JOCAS/jocas.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Jocas, webscraping online job offers
</h5>
<div class="card-text listing-description delink">
<p>The <code>Jocas</code> (Job offers collection and analysis system) project enables the DARES (Ministerial Statistical Office for Labour) to automatically collect job offers…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="2" data-categories="d2Vic2NyYXBpbmclMkNkYXRhdmlzdWFsaXNhdGlvbiUyQ29wZW4tZGF0YSUyQ2luJTIwcHJvZHVjdGlvbiUyQ1NJRVM=" data-listing-date-sort="1640995200000" data-listing-file-modified-sort="1778082419525" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="304">
<a href="../../project/2022_bso/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2022_bso/bso_en.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Open Science Monitor
</h5>
<div class="card-text listing-description delink">
<p>To be able to monitor the opening up of scientific publications (the objective of the <strong>national plan for open science</strong>), the statistical service of the Ministry of Higher…</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jan 2022
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="3" data-categories="aW4lMjBwcm9kdWN0aW9uJTJDd2Vic2NyYXBpbmclMkNDUEklMkNJbnNlZQ==" data-listing-date-sort="1622505600000" data-listing-file-modified-sort="1778082419522" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="244">
<a href="../../project/2021_webscraping_ipc_hotel/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2021_webscraping_ipc_hotel/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Consumer price indices for hotel overnight stays: the experience of webscraping an online booking platform
</h5>
<div class="card-text listing-description delink">
<p>Exploring webscraping tools to collect hotel overnight stay prices for the CPI</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2021
</div>
</div>
</div>
</div></a>
</div>
<div class="g-col-1" data-index="4" data-categories="SW5zZWUlMkNDUEklMkN3ZWJzY3JhcGluZyUyQ3JhbmRvbSUyMGZvcmVzdA==" data-listing-date-sort="1590969600000" data-listing-file-modified-sort="1778082419517" data-listing-date-modified-sort="NaN" data-listing-reading-time-sort="2" data-listing-word-count-sort="383">
<a href="../../project/2020_webscraping_ipc/index.html" class="quarto-grid-link">
<div class="quarto-grid-item card h-100 card-left">
<p class="card-img-top">
<img loading="lazy" src="https://ssphub-test.netlify.app/project/2020_webscraping_ipc/webscraping_ipc.png" class="thumbnail-image card-img" style="height: 150px;">
</p>
<div class="card-body post-contents">
<h5 class="no-anchor card-title listing-title">
Webscrape product characteristics to improve inflation measurement
</h5>
<div class="card-text listing-description delink">
<p>Collect product characteristics on the web to improve the way quality effects are taken into account in the consumer price index.</p>
</div>
<div class="card-attribution card-text-small justify">
<div class="listing-author">

</div>
<div class="listing-date">
1 Jun 2020
</div>
</div>
</div>
</div></a>
</div>
</div>
<div class="listing-no-matching d-none">No matching items</div>
</div>



</section>

 ]]></description>
  <category>Insee</category>
  <category>machine learning</category>
  <category>experiment</category>
  <category>forecasts</category>
  <category>webscraping</category>
  <guid>https://ssphub-test.netlify.app/project/2021_gdp_media/</guid>
  <pubDate>Mon, 01 Mar 2021 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2021_gdp_media/gdp_media_en.png" medium="image" type="image/png" height="53" width="144"/>
</item>
<item>
  <title>Comparison of matching methods and the contribution of machine learning</title>
  <link>https://ssphub-test.netlify.app/project/2021_Appariement/</link>
  <description><![CDATA[ 





<section id="project-summary" class="level1">

<table class="caption-top table">
<thead>
<tr class="header">
<th></th>
<th>Comparison of matching methods and the contribution of machine learning</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Project details</strong></td>
<td>The Resil programme aims to build a sustainable and scalable system of directories of individuals, households and residential premises, updated from a variety of administrative sources. It requires combining several data sources that share no common direct identifier. <br> The aim of the experiment is to <strong>test and compare different matching methods</strong> in order to draw up recommendations for the work needed to build the directories. These recommendations will be based on performance criteria (quality of matching) but also on operational considerations (ease of deployment, computation time, etc.). In particular, the aim is to <strong>assess the contribution and constraints of probabilistic methods and machine learning</strong> in matching tasks. This work will be accompanied by an examination of prior data normalisation and of how to evaluate matching results.</td>
</tr>
<tr class="even">
<td><strong>Stakeholders</strong></td>
<td>Insee</td>
</tr>
<tr class="odd">
<td><strong>Project products and documentation</strong></td>
<td>- <a href="https://journees-methodologie-statistique.insee.net/methodologie-des-appariements-de-donnees-individuelles/">Methodology for matching individual data</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022) <br> - <a href="https://journees-methodologie-statistique.insee.net/probabilistes-ou-deterministes-des-methodes-dappariements-au-banc-dessai-du-programme-resil/">Probabilistic or deterministic, matching methods put to the test by the RéSIL programme</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022) <br> - <a href="https://journees-methodologie-statistique.insee.net/impact-du-nettoyage-des-donnees-sur-la-qualite-dun-appariement/">Impact of data cleaning on the quality of a match</a>, 2022 Statistical Methodology Days (Journées de méthodologie statistique 2022) <br> - <a href="https://www.insee.fr/fr/information/8203044">Matching: aims, practices and quality issues (in French)</a>, Working document, Insee, July 2024</td>
</tr>
</tbody>
</table>


</section>

 ]]></description>
  <category>matching</category>
  <category>administrative data</category>
  <category>in production</category>
  <guid>https://ssphub-test.netlify.app/project/2021_Appariement/</guid>
  <pubDate>Fri, 01 Jan 2021 00:00:00 GMT</pubDate>
  <media:content url="https://ssphub-test.netlify.app/project/2021_Appariement/Resil.png" medium="image" type="image/png" height="96" width="144"/>
</item>
</channel>
</rss>
