Abstract |
The rapid expansion of digital products, and of the data that users generate or that is
gathered about them, has raised the importance of users' privacy and privacy concerns.
Currently, businesses and organizations around the world are required by law (e.g. the
EU General Data Protection Regulation, GDPR) to provide information about how
customers' data is treated, usually in the form of privacy policy documents. Although
such regulations aim primarily to give citizens control over their personal data, users
are often not engaged in this process: because such documents are usually long and hard
to read, users are unwilling to spend much time reading and understanding them.
A current direction for addressing this problem is to enrich privacy policy documents
with annotations, produced either by expert users or by machine learning algorithms.
In this thesis, we designed and implemented an online crowdsourcing platform that
allows users to explore, annotate and review the privacy policies of any kind of digital
product (e.g. mobile applications, websites, appliances) in a user-friendly way. The platform is part
of the tools designed for the CAPrice community, a collective awareness platform for
privacy concerns and expectations.
Privacy policies are annotated using a predefined set of tags, designed to address
user concerns about what data is collected and processed, by whom, for how long
it is retained, how it is secured, and other privacy concerns. Users can contribute
by adding entities like digital products, privacy policies, and annotations to documents or
by reviewing entities added by other users. The platform supports and motivates users in
this effort through various engagement tools (e.g. user scores) and document analysis
tools (e.g. readability scores of privacy policies). An annotation is considered valid or
invalid based on the votes of the users, which are aggregated into a single score
using the Wilson score interval. The aim is to provide a collaborative crowdsourcing
platform that will serve as the reference for user-annotated privacy policy
documents for users, developers, researchers and policy makers. To this end,
we have designed a REST API that provides access to the database of digital products and
their annotated privacy policies, so that this information can be used for the
development of third-party tools and algorithms.
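The vote aggregation mentioned above can be illustrated with a minimal sketch. It assumes that each vote is a binary "valid"/"invalid" judgement and that the score is the lower bound of the Wilson score interval at roughly 95% confidence (z = 1.96). The function names, the z value and the 0.5 decision threshold are illustrative assumptions, not necessarily the rule used by the platform.

```python
import math


def wilson_lower_bound(valid_votes: int, total_votes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the share of 'valid' votes.

    z = 1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if total_votes == 0:
        # No votes yet: nothing supports the annotation.
        return 0.0
    p = valid_votes / total_votes          # observed share of 'valid' votes
    z2 = z * z
    denominator = 1 + z2 / total_votes
    center = p + z2 / (2 * total_votes)
    margin = z * math.sqrt(
        p * (1 - p) / total_votes + z2 / (4 * total_votes * total_votes)
    )
    return (center - margin) / denominator


def is_valid_annotation(valid_votes: int, total_votes: int,
                        threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: valid if the lower bound exceeds a threshold."""
    return wilson_lower_bound(valid_votes, total_votes) > threshold


if __name__ == "__main__":
    # The same 90% approval is trusted more when it rests on more votes.
    print(wilson_lower_bound(9, 10))
    print(wilson_lower_bound(90, 100))
```

Using the lower bound rather than the raw share of positive votes means that an annotation with few votes is not ranked above one with a similar approval rate backed by many votes.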
We conducted a user-based evaluation of our platform, in which users were split into two
groups. Each group was asked to annotate a specific set of privacy policies obtained from
the OPP-115 dataset, an expert-annotated collection of privacy policies.
Each group then had to review and vote on the annotations of the other group and fill in
the corresponding questionnaire. The analysis of the results indicates that the platform
is user-friendly and that the gathered crowdsourced privacy policy annotations are of
high importance and quality, comparable to annotations created by expert users.
|