FR: CNIL - Proposals for privacy-by-design age assurance systems
Published: Apr 25, 2024
Duration: 00:34:00
Category: People & Blogs
We'll be kicking off in about 10 seconds, if people are up for finding their seats. So thank you, everybody, for joining us. Our next speaker is Antoine G from CNIL, talking about proposals for privacy-by-design age assurance systems. Antoine is an IT and privacy expert working for the French data regulator, CNIL, and he has been pioneering analysis of double-blind solutions for securing age verification in a way that protects identity and civil liberties. So over to you, Antoine.

Thank you. Hello everybody, thank you for being here, and thank you also to the organizers of this summit for allowing us to speak here. As George said, I will speak today about privacy by design in age assurance systems, and try to lay out the risk-based approach that we have been following to push for systems that provide more privacy and security without losing too much of the effectiveness of age verification and age restrictions online.

Privacy by design is actually defined in GDPR Article 25, called "Data protection by design and by default", which pushes data controllers to implement state-of-the-art organizational and technical measures in order to protect data subjects' rights. So this is where I stand. As George said, I work at the CNIL, which has been the French data protection authority since 1978 and has been a member of the European Data Protection Board since the GDPR came into full force in 2018. We have of course had a long-term interest in digital identity and digital access management, and we consider that age verification is actually a subpart of identity management, with this little twist that estimation is possible, which is not the case with identity: you cannot estimate the identity of someone. So it creates new opportunities and new challenges for security, privacy and effectiveness.

The CNIL has been working on age assurance for maybe three or four
years specifically, and we have had several publications on the subject, exploring both the age assurance methods and their implications in terms of privacy and effectiveness. We have also published, with a professor from École Polytechnique in Paris-Saclay, a proof of concept using a zero-knowledge proof protocol, namely group signatures, trying to push solutions that would allow age verification to not bring new privacy risks to users. As you will see, this is the core idea that we are pushing: that age verification should be done, and we think can be done, without new risks to the privacy of children, of course, and of all users that will use these systems.

As for me, I am an information technology and privacy expert, so I work in the technical part of our institution. I will try not to be too technical, and I will not be very legal either. I will try to speak more about the risks, since, as I said, we are trying to push a risk-based approach. As a regulator, we are trying not to be prescriptive on technology but rather prescriptive on the principles that the technology should attain, and to promote innovation, because we are sure that there are solutions out there that we cannot think of and that will match all the criteria for a good age verification or age assurance system.

The basic problem, as we all know, is that some platforms, some services online, which we will call relying parties (this is the vocabulary from the digital identity community), are subject to age-related restrictions or processes. These restrictions can be internal policy, external legislation or regulation, or internal processes where we want information, experiences or design to be adapted to the age of the users. In this case, when the user accesses, registers or interacts with the platform, the platform will need to get some data about this user. The use cases that you
all know are: access control to things like violent content or pornography; registration on some websites, depending on the legislation, such as social media or online gambling; specific minor protections that come under certain legislation (in the GDPR we have, for example, parental consent for minors under a certain age, so you have to know that the data subject is a minor if you want to apply this specific protection); or online safety, where data controllers will try to create virtual safe spaces allowing children to enjoy the digital world in a context and with interactions that are suitable for them and for their age group.

So the question we are asking today, and that we have been asking ourselves for quite some time, is the following: how can the relying party obtain information about the age of a user, and ensure that its age-related restrictions are enforced, in a privacy-preserving way?

Of course, there are already privacy risks in web browsing. When you are using any platform or relying party, whether it has an age restriction or not, you interact with this platform and you are giving it some data. So profiling is already in place: the website can, using cookies, fingerprinting, your digital footprint and your online behavior, get some idea of who you are, what you are doing on the website and what category of user you belong to. This is possible if you come several times to the same website using the same computer, the same browser or the same cookie session, or of course if you register on this website, where the relying party gets a stronger identification, maybe in the form of your email address if it contains your name, or, if you have paid for an online service or a delivery, the contact information you provided: your address, your real name, your credit card, etc. So this is already a risk for any one of us using
the internet. I will not be exhaustive in the list of privacy risks of using the internet, but other risks include the non-respect of user consent and of users' rights under the GDPR (I am talking from the French point of view), and of course data leakage and breaches, so unauthorized and illegitimate access to the data held by the data controller. These are already in place, age verification and age assurance or not; this is something that we expect. But with these risks, unless you provide registration or payment data, or this kind of specific data saying exactly who you are, you will usually not be identified if you are using a brand new computer, or a computer in a hostel with public Wi-Fi. They will know somehow where you are, from your IP address, and what you are doing, but they will not be able to be certain of who you are.

Now we are adding age assurance. When the user wants to access the relying party, the relying party will require some age assurance, and the user will have to provide some information so that the relying party can estimate or know their age and act accordingly. Then the relying party takes the decision on whether to grant access or registration, and what information to give, based on this information. So the next question is: what are the specific risks, because we have talked a little about the general privacy risks, associated with this process (one, two, three): requiring age assurance, providing information, and checking this information online?

I will talk about self-declaration for just 30 seconds to get it out of our way. Self-declaration is asserted age assurance. It has been in use for a long time, and I don't think it has protected a lot of minors from accessing inappropriate online content; maybe some, but not all of them. It works in a very simple way: the relying party will ask the user for age assurance, and the
user will provide information without any backing evidence: either "I am an adult", or "I am 26", or "this is my date of birth". But there is absolutely no check, no backing evidence, and I can claim that I am 34 if I am six, or that I am six if I am 81, and the relying party again takes a decision based on this information.

There is a specific privacy risk here, even though it is small: if the user is actually honest and gives their real date of birth, then the relying party can collect that date of birth, which is quite identifying, especially when it already has the IP address and other information that may lead to, let's say, a home, a school or an environment; the date of birth then provides more identifying information. The other risk, which is not about privacy but which is really important, is that the system is really not effective. It is really easy to bypass, really easy to give false information, and it will probably not protect anyone.

So the new question is how to effectively apply these age-related restrictions and processes. The goal is to get reliable information, so information that is about the user and that is true, and that is user-bound, which means that the user giving this information is actually the person to whom it refers. This is where digital identity comes into play. What are the usual factors of identification and authentication? Because what we are talking about here is reliable information about a person: this is some kind of identification of this person, and making sure that this person is who we say they are, so authentication of this person. Usually we say there are three types of factors for this authentication online.

The first is what we call the knowledge factor, so things people know, like your password, your login, your credentials. This can be used for age assurance: for example, if I log in to my
social media account, or I log in to my online digital identity provider, some information can go from my digital identity provider to the relying party, so that the relying party knows that I have access to this digital identity and that it is probably me. I say probably because the user binding here is low. If the authentication is based only on a knowledge factor, it is easily shared: if I have a digital identity which holds very little information about me, I may have no problem sharing it, selling it online, or putting it on a website for people to use. Since this user binding is so low, the effectiveness will be a measure of how much risk I take by sharing the credentials. If the credentials allow me to log in to my tax payment system or to my government accounts, then I will probably not want to share them, and the user binding will be quite high; but in that case I would maybe not want to use this login to connect to a pornographic website or an online gambling site. So it is a possibility, but we think it has not been very extensively studied because of this low user binding.

The second type of authentication factor is the possession factor. The possession factor is what you have: something which proves that you actually hold a thing that you already had when you registered, or something like this. This can be a credit card, a mobile phone or an ID card, and these are a little more bound to the user, because normally only one copy of the physical item exists. I can give it to someone, I can lend my mobile phone or my credit card to someone, but I cannot put it online, and people will not be able to use it on a massive scale, which could be the case with a simple knowledge factor. So we say the user binding is higher than with the knowledge factor, but still not really high, since I can still share the item and there may be ways to
circumvent these systems. For example, with a credit card, I have the card, but the credit card number and what is required to use a credit card online is more of a knowledge factor than a possession factor. So each specific context has to be studied independently.

The third authentication factor usually defined is the inherence factor, so what you are. This is for example the analysis of facial features for biometric analysis, the analysis of your voice, of the shape of your hand, or things like this, which are really bound to you. Well, there is generative AI now, but usually it is quite hard to share your face or your voice; spoofing with AI is a completely different topic that I think some people talked about yesterday and will talk about more, but it is not the subject of my presentation. This factor is considered to have a high binding to the user. And of course, if you really want to bind the authentication to the user, you will use multi-factor authentication: you will for example combine presenting your ID card, which is a possession factor, with your facial features, so that there can be a biometric match between your face and your card, and this will have very strong user binding.

The problem is that reliable data and strong user binding increase the identification risk. We were in a situation where the website, in the classical browsing scenarios, would be able to collect data from you but not to identify you properly. If we use more and more user-bound authentication factors, then we steadily increase the risk of identification by a lot of services and bodies online. So the main risk of this system is to create a loss of, let's say, pseudonymity over the internet, which is something that we data protection authorities don't really like. At least we want the users to have a
choice.

Now, the most basic form that we can imagine to go forward from self-declaration is that the relying party is itself in charge of doing the age assurance. The user wants to access a relying party; the relying party has to do some age assurance; the user will provide some information to the relying party; the relying party will process this information and take a decision. For example, I am showing you my ID card and my face, and you are, I don't know, some social media that wants to verify my account to unblock it, and I have to present this information. The social media will maybe process it itself and take a decision based on this information.

The privacy risk here is of course that the data that was already collected, and the profiling that was already possible for the website, will now have identifying data in addition. So the profile will be a lot more valuable and a lot stronger, because it is linked to you: even if you change your cookie session, your computer or your fingerprint, it can be traced to your direct identity. So the profiling can be a lot stronger whenever you provide this user-bound data. And of course data leaks and breaches will be more severe, since all the data will be consolidated and more strongly associated with real persons. The third problem is that if each relying party has to do its own age assurance or age verification, people will get used to providing their ID, their face or other directly identifying information to a lot of different online platforms and services. It will become kind of normal to provide all this information, which means that we will lower our guard and will probably provide information that we should not have provided, which can lead to scamming, phishing and ID spoofing. This is something that we see more and
more as we use the digital world more and more to process this identity data. The other possible risks are about the possible conflict of interest: if you are a platform, what you want is as many users as possible coming to your services, and at the same time you are responsible for blocking some of the people connecting to your services, so there may be some conflict of interest. And the auditing is a bit difficult, because you have hundreds or thousands or tens of thousands of relying parties, and you must make sure that each properly implements an age assurance or age verification system that is adapted to your regulation.

So moving forward, the idea was to ask for a third-party age assurance provider, someone who will be in charge of doing these age assurance checks, who will probably have a little less conflict of interest and will be easier to audit, because there will be fewer bodies to audit, and whose checks will then be deployed by all the relying parties. The concept is simple. I give information to the relying party. The relying party has a contract with what I called an integrated third party, because it is transparent to the user; it is just a subcontractor of the relying party. The relying party transfers the user's information to an age assurance provider, which then processes the collected data and provides age assurance to the relying party.

In this scenario we still have a high risk of profiling, which is actually increased, because the user's directly identifying information goes to both the relying party and the age assurance provider. So there is still a possibility of profiling coupled with directly identifying data by the relying party, and this data can be associated with sensitive data, such as health data, sexual orientation data or addiction problems, if we are talking about pornography, gambling websites, etc. So this
profiling is problematic. We still have the problem of data leaks, which can now come from both the relying party and the age assurance provider if the security measures are not properly in place, and scamming and phishing are still possible, since we are still giving our information directly to a lot of different relying parties. However, we think this lowers the risk of a conflict of interest and makes auditing of these services easier.

If we go further, we can change the paradigm a little, so that the user does not provide this information to the relying party but directly to the age assurance provider. This is a better situation, because the relying party, which is for example, as I said, a gambling website, a pornographic website or a social media service, does not directly get the ID or the picture of the user. So we are reducing the risk of profiling and data breaches involving identity across a whole bunch of relying parties, and this profiling and data breach risk is more concentrated on age assurance providers, which are supposed to be fewer in number and eager to get audited and certified so that they comply with regulation. We still have some profiling, including profiling with identifying data, by the age assurance provider, which is possible if the proper measures are not in place, but we reduce these risks a little.

Then, what we have been advocating, and this was actually the proof of concept published by the CNIL a little more than two years ago, is a third party that is completely independent from the relying party. When the user wants to access a website, a relying party that requires age assurance, they turn to their favorite age assurance provider. Let's say it has to be on the list of age assurance providers known to the relying party, but the user chooses their preferred provider and gives it information; the provider issues, for example, an age certificate, which is then provided to the relying
party, which takes a decision based on this age certificate. If we just look at this scheme, we have a new risk here, which is that we are going back to a weak user binding. In the previous cases, since the age assurance provider was directly in contact with the relying party, the relying party had trust in this provider, and when it received an age certificate or age assurance from the provider, it would say: okay, I know this provider, it is my usual provider, it is a fine one and I trust it. But here, since it is the user who provides the age certificate, it is quite hard to know whether this certificate is genuine, whether it really comes from an age assurance check that was done on this user, and whether it has been shared between different users, etc. So we are lowering the profiling risk, because here the age assurance provider does not know what website I am visiting, but we are lowering the effectiveness as well, because we are weakening the binding of this age attribute to the user.

I forgot one thing: I called this scenario real-time age assurance by an independent third party, which means that each time I connect to a relying party that asks me for age assurance, I also connect to my age assurance provider. The age assurance provider will not necessarily know for which relying party I am asking for an age certificate, but it will know that right now, I don't know, at five to eleven, I need an age certificate to prove that I am 18-plus to go on a website. And if I do that every day at 9, every day at 12 and every day at 11, they will know that maybe I am addicted to gambling, or maybe I go to porn websites. So there will still be some traces, some footprints of my activity on the age assurance provider side, which also has
directly identifying data on me. To avoid this, we have thought (and we are sure that there are other solutions), or rather we have read, because this is typical in digital identity services, that users should be able to store their credentials from the age assurance provider. So beforehand, before going to the relying party, I provide my information to some age assurance provider or digital identity provider, who gives me an age certificate that I store and can use over time, let's say for a week, a month or a year, depending on the characteristics of this certificate and of the age assurance provider, and on the user binding that I can add to this certificate to make sure it is mine when I am using it. Then I give the certificate, and the relying party takes its decision based on it. Again we have some kind of weak user binding, because we are still not sure that the stored age certificate was issued to the user who is using it, but we are reducing the profiling risk on the age assurance provider side.

This is why we are coming to decentralized identity models, which are one of the models we think are also adapted to age verification, but not the only one. Here the age assurance provider uses cryptographic material to sign the certificates, to make sure that they cannot be forged or altered, and it provides a registry, some place where the signatures authorized to issue these certificates are stored, which can be consulted by the relying party. So when the relying party has to verify the age certificate, it can at least make sure that this is not a forged certificate and that it comes from an authorized source. We still have some profiling based on age assurance, because when you receive the certificate,
maybe it carries some information about the user that is more identifying than cookies, footprints and fingerprints. Still, we are considerably reducing the risk, since there is no call from the user to the age assurance provider each time a relying party requires age assurance, and we are also reducing the forgery risk, since the certificates are signed.

Now, the more recent, state-of-the-art, but at the same time still-in-development technologies that we can add to this whole decentralized identity model are the following. When the user stores their age certificate, they use some kind of secure element, for example a trusted execution environment on a mobile phone, or it can also be some kind of chip, so a possession factor, in which you can store the certificate. This makes sure that the certificate cannot be shared online with many different users, and it strongly increases the user binding of this proof of age. Another improvement over this scenario, which is the one we have been pushing as well, although we know that it is not yet fully standardized (it is reaching a good maturity now and, we hope, in the coming years), is what are called zero-knowledge proof protocols. These allow me to prove to the relying party that I own an age certificate from a registered age assurance provider without giving any other information that can lead to profiling. So we are reducing the risk of profiling based on the age assurance material that was provided beforehand.

This is, let's say, the typical decentralized identity solution that we think is appropriate. We think there are other solutions, and we think there need to be other solutions, because people will maybe not want to have a secure element or store something in a digital wallet on their phone, or maybe they will need age assurance in a
context where they don't have much with them, and they will need to be able to get age assurance without relying on these technological bricks and on a real government-issued ID to get a proper certificate, etc. So other solutions are open. We often hear about a risk-based approach to choosing a solution that matches the risk of harm to children in the world of age assurance and age verification, and we wanted to say that this risk-based approach is also very appropriate and necessary on the privacy and security side. The problem we have is that most age assurance or verification methods rely on directly identifying data. We know that this is necessary to get strong user binding and reliable information, but we think that with technology, and with the architecture of such systems, we can mitigate most of the security and privacy risks. This is what I wanted to share with you today.

There are other possibilities, because I was saying that you should be able to do age estimation or age assurance in any context, even if you don't have your ID with you. Age estimation comes to mind here, because it is a specificity of age assurance compared to digital identity: looking at my face, you can probably tell that I am more than 18 years old. This has been developed a lot, especially by companies in the UK but also by other companies, and we think it is great that this tool exists. We know several companies are pushing to move age estimation from an online service to the user's device, the browser or the cell phone, which is again a very good idea: instead of providing directly identifying data to a third party, you keep your directly identifying data with you, and you just get the software from the age assurance provider, which in turn will be used to provide age assurance to your
relying party. We have eliminated most of the risks here, or that is what we think, but of course there are all the implementation risks. These systems are quite complex and require good implementation, careful technical planning and auditing. And of course we encourage open standards, open-source software, bug bounties and everything that can make sure that the implementation actually matches the objectives we had.

To conclude, the questions we had were, first: what are the specific risks associated with age assurance? We said over-collection of data, mostly identity data; profiling, including on the identity and/or sensitive data of children and adults alike; and of course low efficacy, if we use technology that does not provide enough user binding. The second question was how to effectively apply age-related restrictions and processes, and we answered that we need reliable information and user binding. The third question was how the relying party can obtain information about the age of a user in a privacy-preserving way. Our idea is that the information should be obtained, most of the time, directly from the user, and that cryptography and secure hardware can be used to build trust in the user, avoiding both the self-declaration scenario and profiling based on the age assurance provided. Thank you for your attention.

[Applause]

Yes? The microphone, yes, sorry. Are we going to certify schemes or providers? Actually, we are not officially the regulator on this topic. The regulator is rather Arcom, which is the audiovisual regulator and now the digital services coordinator under the DSA; they are, let's say, the main regulator for these age restrictions online. But the law that is still under scrutiny by the parliament in France says that Arcom will have to get input from the CNIL on the data protection side, to make sure that the systems that they
implement actually restrict children from accessing inappropriate content or experiences, but are also compliant with the GDPR and state-of-the-art data protection. So we are not directly regulating this, but we are trying to provide these inputs to the audiovisual regulator so that they are taken into account.

So could we have further questions afterwards? Okay, because we are running a bit behind; I think we are five minutes behind already. So if we can do that later, that's okay. Thank you so much, Antoine. I think it is really exciting to see all the work that data protection authorities are doing on