{"id":6027,"date":"2023-03-06T13:10:46","date_gmt":"2023-03-06T13:10:46","guid":{"rendered":"https:\/\/www.unhcr.org\/blogs\/?p=6027"},"modified":"2023-03-06T15:24:34","modified_gmt":"2023-03-06T15:24:34","slug":"how-a-hackathon-supported-personal-data-protection","status":"publish","type":"post","link":"https:\/\/www.unhcr.org\/blogs\/how-a-hackathon-supported-personal-data-protection\/","title":{"rendered":"How A Hackathon Supported Personal Data Protection"},"content":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; admin_label=&#8221;section&#8221; _builder_version=&#8221;4.16&#8243; custom_padding=&#8221;1px|||||&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_row admin_label=&#8221;row&#8221; _builder_version=&#8221;4.16&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.16&#8243; custom_padding=&#8221;|||&#8221; global_colors_info=&#8221;{}&#8221; custom_padding__hover=&#8221;|||&#8221;][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<p>By <a href=\"https:\/\/www.unhcr.org\/blogs\/blog-authors\/federico-sanson\/\">Federico Sanson<\/a><\/p>\n<p>[\/et_pb_text][et_pb_image src=&#8221;https:\/\/www.unhcr.org\/blogs\/wp-content\/uploads\/sites\/48\/2023\/03\/RF2247348_201909031609_IMG_0019_ES-scaled.jpg&#8221; title_text=&#8221;Argentina. 
UNHCR, supporting Venezuelan refugee and migrant families&#8221; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_image][et_pb_text admin_label=&#8221;Text&#8221; _builder_version=&#8221;4.20.0&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; custom_padding=&#8221;0px|||||&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<figure class=\"wp-block-image size-full\"><figcaption><span style=\"font-size: small;\"><em>UNHCR staff pilot the Communicating with Communities survey in Argentina, to assess the needs of Venezuelan nationals, at a community event in Buenos Aires. <br \/>\u00a9 UNHCR\/Eliana Sarraf<\/em><\/span><\/figcaption><p><span style=\"font-size: small;\"><em> <!-- \/divi:image --><\/em><\/span><\/p>\n<\/figure>\n<p><span>In today&#8217;s rapidly evolving digital world, personal data protection has become a critical issue. With the exponential growth of big data, large amounts of personal information are being collected, stored, and processed, making privacy a major concern for individuals, organizations, and governments alike. To safeguard personal data, it is vital to adopt Privacy Enhancing Technologies (PETs), which are a range of innovative technical solutions designed to enhance privacy and protect personal data. These technologies use encryption and other methods to secure personal information and prevent unauthorized access. 
As technology continues to advance, so does the potential for privacy technologies to protect personal data in a variety of ways.<\/span><\/p>\n<h2><span>A hackathon provided the opportunity to explore privacy enhancing technologies in a real-life scenario<\/span><\/h2>\n<p><span>The United Nations\u2019 Privacy Enhancing Technologies Lab<a href=\"#_ftn1\" name=\"_ftnref1\">[1]<\/a> and UNHCR, the UN Refugee Agency, hosted a <a href=\"https:\/\/petlab.officialstatistics.org\/\">data science hackathon<\/a> during the <a href=\"https:\/\/unstats.un.org\/bigdata\/events\/2022\/conference\/\">7<sup>th<\/sup> International Conference on Big Data and Data Science for Official Statistics<\/a>, which took place in Yogyakarta, Indonesia, from 11 to 14 November 2022.<\/span> <span>The competition was devised to increase awareness of PETs and their potential to allow data access for tackling important societal questions. To make the competition as close to real-life challenges as possible, UNHCR provided a dataset from its <a href=\"https:\/\/microdata.unhcr.org\/index.php\/home\">Microdata Library<\/a>. Participants were asked to analyze survey data collected from refugees during the COVID-19 pandemic in Kenya to understand the main factors contributing to their social and economic vulnerability. The 72-hour hackathon drew around 300 teams representing national statistical organizations (NSOs), data science start-ups and academic research centers from over 30 countries. <\/span><\/p>\n<p><span>During the hackathon, participants used machine learning to estimate unknown values in the dataset and were evaluated according to the accuracy of their predictions. 
An additional challenge of this hackathon was that the data provided was not complete, so participants were not able to directly view the sensitive variables, but had to interact with them through privacy-preserving methods.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><!-- \/divi:paragraph -->[\/et_pb_text][\/et_pb_column][\/et_pb_row][et_pb_row column_structure=&#8221;1_2,1_2&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;1_2&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_image src=&#8221;https:\/\/www.unhcr.org\/blogs\/wp-content\/uploads\/sites\/48\/2023\/03\/IMG_1-scaled.jpg&#8221; title_text=&#8221;IMG_1&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_image][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<p><em>The participants of the 7th International Conference on Big Data and Data Science for Official Statistics in Yogyakarta, Indonesia. 
\u00a9 BPS-Statistics Indonesia<\/em><\/p>\n<p>[\/et_pb_text][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_text][\/et_pb_column][et_pb_column type=&#8221;1_2&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_image src=&#8221;https:\/\/www.unhcr.org\/blogs\/wp-content\/uploads\/sites\/48\/2023\/03\/IMG_2-scaled.jpg&#8221; title_text=&#8221;IMG_2&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_image][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<p><em>Federico Sanson (on the right) participated in a panel at the International Conference on Big Data. \u00a9 BPS-Statistics Indonesia<\/em><\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<h2><span>Privacy enhancing technologies are vital to safeguard personal data in UNHCR<\/span><\/h2>\n<p><span>When sharing data in sensitive humanitarian contexts, it is essential that the privacy of individuals and the confidentiality of their information is preserved. 
UNHCR carries out surveys and other data collections on a regular basis and makes anonymized datasets available on the Microdata Library, where datasets are curated and can be downloaded by researchers, partners, and other stakeholders.<\/span><\/p>\n<p><span>Because the collected data may contain personally identifiable information, UNHCR carries out an anonymization process before sharing them, using statistical disclosure control techniques to reduce the risk that any single individual can be re-identified. In some cases, however, datasets may be too sensitive to share, which means that much of the analysis from which public good could be derived is in practice unavailable. PETs, a range of novel data-processing techniques, might make such sharing possible. <\/span><\/p>\n<p><span>The PET used during the hackathon was based on the concept of differential privacy. The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single observation. The goal is to give each individual in the dataset roughly the same privacy that would result from having their data removed from it. <\/span><\/p>\n<p><span>As the participants did not receive the full dataset, they could query \u2013 or ask \u2013 the PET tool for certain information, such as the number of male or female respondents or the mean of their household expenditure. The tool returned an approximately correct answer, because it added \u2018statistical noise\u2019 to each result to protect privacy. Each interaction with the data had a cost, though. 
The magnitude of this cost was determined by how much noise the participants were willing to have added to their queries; the noisier the query, the cheaper it was.<\/span><\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][et_pb_row column_structure=&#8221;3_5,2_5&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;3_5&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_image src=&#8221;https:\/\/www.unhcr.org\/blogs\/wp-content\/uploads\/sites\/48\/2023\/03\/blog.jpg&#8221; title_text=&#8221;blog&#8221; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_image][\/et_pb_column][et_pb_column type=&#8221;2_5&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<p><span>As the figure on the left shows, the \u2018X\u2019 data was available to the participants, while the \u2018Y\u2019 data was stored on a different server. The <em>Train Y<\/em> data could be accessed only through the PETs frameworks provided. The <em>Test Y<\/em> data could not be accessed at all, and the final goal was to estimate it. Participants had to train a machine learning model using <em>Train X<\/em> (accessible) and <em>Train Y<\/em> (only accessible through PETs frameworks). 
They then had to use <em>Test X<\/em> (accessible) as input to the machine learning model to predict <em>Test Y<\/em> (not accessible at all).<\/span> <span>Final scores were determined by a trade-off between the accuracy of the predictions and the total cost of all queries a team made.<\/span><\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<h2><span>Privacy enhancing technologies help to create a more secure digital environment<\/span><\/h2>\n<p><span>The hackathon\u2019s PETs proved effective in delivering useful results while preserving the privacy of the dataset. Additionally, the participants reached a good level of accuracy in their predictions, with the winning team achieving an accuracy of around 80 per cent. Lastly, the participants did not have any difficulty in understanding and working with UNHCR\u2019s dataset, even though they were not necessarily familiar with forced displacement data. This shows the high quality of the datasets available on UNHCR\u2019s Microdata Library and the clarity of their documentation and metadata.<\/span><\/p>\n<p><span>As big data continues to grow in significance and usage in the humanitarian context, the use of privacy enhancing technologies is becoming increasingly important to ensure the security and privacy of personal information. For UNHCR, the insights gained from the hackathon will inform ongoing work to explore how PETs can be used to enable safe sharing of sensitive data and improve decision-making without compromising individuals\u2019 privacy. 
Finally, the use of these technologies is crucial for creating a more secure digital environment, particularly for vulnerable populations such as the people we serve.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_divider _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;][\/et_pb_divider][et_pb_text _builder_version=&#8221;4.20.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221;]<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\"><span>[1]<\/span><\/a> The United Nations\u2019 Privacy Enhancing Technologies Lab is a collection of national statistics organizations and technology experts collaborating to modernize the way data are shared and statistics are produced.<\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][\/et_pb_section]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In today&#8217;s rapidly evolving digital world, personal data protection has become a critical issue. With the exponential growth of big data, large amounts of personal information are being collected, stored, and processed, making privacy a major concern for individuals, organizations, and governments alike. To safeguard personal data, it is vital to adopt Privacy Enhancing Technologies (PETs), which are a range of innovative technical solutions designed to enhance privacy and protect personal data. These technologies use encryption and other methods to secure personal information and prevent unauthorized access. 
As technology continues to advance, so does the potential for privacy technologies to protect personal data in a variety of ways.<\/p>\n","protected":false},"author":899,"featured_media":6029,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"on","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[8,21,49,537,23],"tags":[521,529],"class_list":["post-6027","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data","category-data-and-statistics","category-innovation","category-poverty","category-socioeconomic-inclusion","tag-refugee-data","tag-unhcr"],"_links":{"self":[{"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/posts\/6027","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/users\/899"}],"replies":[{"embeddable":true,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/comments?post=6027"}],"version-history":[{"count":5,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/posts\/6027\/revisions"}],"predecessor-version":[{"id":6051,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/posts\/6027\/revisions\/6051"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/media\/6029"}],"wp:attachment":[{"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/media?parent=6027"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/categories?post=6027"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.unhcr.org\/blogs\/wp-json\/wp\/v2\/tags?post=6027"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true
}]}}