Artificial intelligence has quickly become part of the contemporary zeitgeist — yet ethical considerations around the subject remain unresolved. How many users are fully aware of what they’re signing up to?
Here, by honing in on the terms and conditions and privacy policies behind the most popular AI tools and apps, Ecommerce Platforms unpacks what you need to know when using these tools for your day-to-day business needs.
We’ve analyzed the data and personal information these tools collect (and for what purpose) to help you to determine which AI tools, software, and platforms are the most suitable for your intended use. We also consulted a legal expert to break down the jargon behind these tools’ terms and conditions.
When you use an AI app, you consent to (at least some of) your data being collected by it
We analyzed the Apple App Store privacy labels for the mobile apps of around 30 popular AI tools to understand which ones collect your data, and why.
The data collected from users (and its purpose) is divided into 14 categories, making it possible to establish which apps collect and track the most user data. The percentages in the tables below reflect how many of those 14 categories an app collects data across; five of 14, for example, works out at around 36%.
For further details, take a look at the methodology section at the end of this page.
What data do these AI apps collect?
The AI tools assessed in this research collect various types of data. Some focus on personal details about users, from screen names and bank details to their health and fitness, and even sensitive information such as race/ethnicity, sexual orientation, gender identity, and political opinions.
Others relate to content created by users (like emails and messages, photos, videos, and sound recordings), or how users interact with the product itself, like their in-app search histories or what adverts they’ve seen. More impersonal still is the diagnostic information collected to show crash data or energy use.
Why do these AI apps collect data?
There are different reasons why apps collect data, some of which may be seen as more justifiable than others — for example, biometrics or contact information can be used to authenticate the user’s identity.
Similarly, access to certain data may be required for an app to function correctly, including to prevent fraud or improve scalability and performance.
More specifically, messaging apps need access to contacts, phone cameras, and microphones to allow calls, while geolocation is necessary for taxi or delivery apps.
Arguably less essential reasons to collect data include advertising or marketing by the app’s developer (for example, to send marketing communications to your users); enabling third-party advertising (by tracking data from other apps to direct targeted ads at the user, for instance); and analyzing user behavior for purposes including assessing the effectiveness of existing features or planning new ones.
AI apps that collect your data to share with third-party advertisers
The data categories tracked for this purpose span Browsing History, Contact Info, Identifiers, Location, Other Data, Purchases, Search History, and Usage Data. The middle column lists the number of data points collected in each tracked category.

| AI app | % data shared with others | Data points per tracked category | No. of data points collected |
|---|---|---|---|
| Canva | 36% | 2, 2, 1, 1, 2 | 8 |
| Duolingo | 36% | 2, 1, 1, 1, 2 | 7 |
| Google Assistant | 21% | 1, 1, 1 | 3 |
| Bing | 14% | 1, 1 | 2 |
| Pixai | 14% | 1, 1 | 2 |
| Wombo | 14% | 1, 1 | 2 |
| ChatGPT | 7% | 1 | 1 |
| Genie AI | 7% | 1 | 1 |
| Lensa | 7% | 1 | 1 |
| Speechify | 7% | 1 | 1 |
| StarryAI | 7% | 1 | 1 |
Of all the AI apps included in our research, Canva, a graphic design tool, collects the most data from its users for third-party advertising purposes — around 36%. By contrast, the five apps that collect the least data for this purpose gather just over 7%.
The data that Canva’s app collects from you and shares with third parties includes your search history, location, email address, and other information shown in the table above.
Tied with Canva on 36% (though collecting seven data points to Canva’s eight) is the gamified language-learning app Duolingo, followed by Google Assistant (21%) and Microsoft’s Bing (14%), all of which also share your data with third parties.
Of the five apps that collect the least data, only StarryAI (an image generator) confines itself to sharing usage data alone.
AI apps that collect your data for their own benefit
The data categories tracked for this purpose span Browsing History, Contact Info, Identifiers, Location, Purchases, Search History, and Usage Data. The middle column lists the number of data points collected in each tracked category.

| App | % data collected for app’s own benefit | Data points per tracked category | No. of data points collected |
|---|---|---|---|
| Canva | 43% | 2, 2, 1, 1, 1, 2 | 9 |
| Facetune | 36% | 2, 4, 2, 2, 4 | 14 |
| Amazon Alexa | 36% | 4, 2, 1, 1, 2 | 10 |
| Google Assistant | 36% | 1, 2, 2, 1, 2 | 8 |
| PhotoRoom | 29% | 1, 1, 1, 1 | 4 |
| Duolingo | 21% | 2, 1, 1 | 4 |
| StarryAI | 14% | 2, 1 | 3 |
| Bing | 14% | 1, 1 | 2 |
| Lensa | 14% | 1, 1 | 2 |
| Otter | 7% | 2 | 2 |
| Youper | 7% | 1 | 1 |
| Poe | 7% | 1 | 1 |
| Pixai | 7% | 1 | 1 |
| Speechify | 7% | 1 | 1 |
| Wombo | 7% | 1 | 1 |
Canva also tops the chart for AI apps collecting user data for their own advertising or marketing purposes. To do so, Canva collects around 43% of its users’ data.
In third place, Amazon Alexa collects 36% of your data for the same purpose. This includes your email address, physical address, phone number, search history, and purchase history, plus five other data points. Google Assistant collects the same percentage of data for this reason, though across eight individual data points, compared to the ten that Amazon Alexa collects.
The text-to-speech voice generator Speechify is among the apps that collect the least data. According to the privacy labels on its Apple App Store listing, Speechify collects just one data point for its own benefit: your device ID.
AI apps that collect your data for any purpose
The data categories tracked here span all 14: Browsing History, Contact Info, Contacts, Diagnostics, Financial Info, Health & Fitness, Identifiers, Location, Other Data, Purchases, Search History, Sensitive Info, Usage Data, and User Content. The middle column lists the number of data points collected in each tracked category.

| App | % data collected | Data points per tracked category | No. of data points collected |
|---|---|---|---|
| Amazon Alexa | 93% | 24, 4, 9, 3, 4, 10, 8, 4, 5, 5, 4, 13, 23 | 116 |
| Google Assistant | 86% | 4, 8, 2, 6, 1, 8, 5, 2, 1, 5, 8, 8 | 58 |
| Duolingo | 79% | 10, 1, 7, 1, 12, 4, 4, 6, 1, 7, 7 | 60 |
| Canva | 64% | 11, 3, 1, 8, 5, 4, 5, 10, 6 | 53 |
| Otter | 57% | 7, 3, 5, 7, 2, 3, 2, 11 | 40 |
| Poe | 57% | 2, 2, 3, 6, 2, 3, 2, 5 | 25 |
| Facetune | 50% | 6, 8, 18, 8, 8, 14, 2 | 64 |
| Bing | 50% | 1, 2, 6, 3, 3, 2, 3 | 20 |
| DeepSeek | 50% | 2, 3, 4, 1, 1, 2, 3 | 16 |
| Mem | 43% | 6, 4, 6, 6, 6, 4 | 32 |
| ELSA Speak | 43% | 2, 6, 6, 3, 3, 3 | 23 |
| PhotoRoom | 43% | 2, 1, 9, 3, 4, 1 | 20 |
| Trint | 43% | 1, 2, 1, 4, 1, 2 | 11 |
| ChatGPT | 36% | 4, 8, 5, 7, 2 | 26 |
| Perplexity AI | 36% | 6, 6, 2, 1, 6 | 21 |
All AI models require some form of training through machine learning — meaning that they need data.
If we want AI tools to improve and become more useful, giving up some of our privacy to provide this data can be seen as a necessary trade-off.
However, the question of where the line between utility and exploitation should be drawn, and why, is a thorny one.
Given its current notoriety, it’s worth addressing DeepSeek. Its listing on the Apple App Store states that DeepSeek doesn’t collect user data for its own benefit (for example, DeepSeek’s own marketing and advertising) or to share with third parties.
But it’s worth pointing out that its Privacy Policy states otherwise (more on this later).
The DeepSeek app itself collects 50% of its users’ data, which serves DeepSeek’s ‘Analytics’ and ‘App Functionality’ purposes. For comparison, the ChatGPT app collects 36%.
Some media outlets report concerns about security risks related to DeepSeek’s Chinese origins (both in terms of data collection and the possible spread of misinformation), as well as its undercutting of US rivals. Neither concern is likely to be allayed by DeepSeek’s Terms and Conditions and Privacy Policy, which would take around 35 minutes to read and are rated as “very difficult” on the Flesch-Kincaid readability scale.
Regardless of how your data is used, Amazon Alexa collects more of its users’ data than any other AI app included in this research. Overall, it collects 93% of your data (116 individual data points, primarily contact info, user content, and usage data).
Google Assistant comes next, collecting 86%, followed by Duolingo, which collects 79%.
At the other end of the scale, the AI image generator Stable Diffusion does not collect any data at all, according to the privacy labels on its Apple App Store listing.
While it’s true that all generative AI models require massive amounts of data to be trained, this training happens prior to the development of specific apps. In most cases, app creators don’t own the AI models they use; user data collection therefore relates to the functionality of the app itself. This may explain why some of the apps we’ve investigated have no information in the above table.
Now, let’s look at the legal documentation behind different AI tools to find out how easy or difficult they are to read. This is based on the Flesch reading-ease test, part of the Flesch-Kincaid readability system.
This system equates texts to US school reading levels (from fifth to 12th grade), then College, College Graduate, and Professional. Sixth-grade-level texts are defined as “conversational English for consumers”, whereas professional-rated texts are described as “extremely difficult to read”.
The lower the reading-ease score, the harder the text is to read; every document in the table below scores under 40, placing it at College level or harder.
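To make these figures concrete, here’s a minimal Python sketch of how a Flesch reading-ease score and an estimated read time can be computed. The syllable counter is a rough heuristic and the 240-words-per-minute reading speed is an assumption for illustration; the exact tooling behind this study may differ.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels, with a crude
    # adjustment for a silent trailing 'e'. Real tools use pronunciation
    # dictionaries and are more accurate.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    # Flesch reading ease:
    #   206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    # Lower scores mean harder text; below 30 is "very difficult".
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

def read_time_minutes(word_count: int, words_per_minute: int = 240) -> float:
    # Assumes an average adult silent-reading speed of roughly 240 wpm.
    return word_count / words_per_minute

sample = "The parties shall indemnify and hold harmless the aforementioned entities."
print(round(flesch_reading_ease(sample), 1))  # legalese scores low (hard to read)
print(round(read_time_minutes(25_000)))       # a 25,000-word T&C: ~104 minutes
```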
| AI tool | Time to read T&Cs | Reading-ease score | Difficulty |
|---|---|---|---|
| Clipchamp | 3 hours 16 minutes | 27.2 | 🤯🤯🤯🤯🤯🤯🤯 |
| Bing | 2 hours 20 minutes | 35.4 | 🤯🤯🤯🤯🤯 |
| Veed.io | 2 hours 15 minutes | 37.9 | 🤯🤯🤯🤯🤯 |
| Facetune | 2 hours 4 minutes | 34.4 | 🤯🤯🤯🤯🤯🤯 |
| TheB.AI | 1 hour 47 minutes | 31.4 | 🤯🤯🤯🤯🤯🤯 |
| Otter | 1 hour 11 minutes | 32.4 | 🤯🤯🤯🤯🤯🤯 |
| Jasper | 1 hour 9 minutes | 22.0 | 🤯🤯🤯🤯🤯🤯🤯🤯 |
| Gamma | 1 hour 6 minutes | 30.9 | 🤯🤯🤯🤯🤯🤯 |
| Speechify | 1 hour 6 minutes | 35.0 | 🤯🤯🤯🤯🤯🤯 |
| Runway | 1 hour 2 minutes | 28.3 | 🤯🤯🤯🤯🤯🤯🤯 |
In 2023, British business insurance company Superscript polled the owners of 500 small and medium-sized enterprises (SMEs) to find out what consumes most of their time.
Tellingly, ‘getting enough sleep’ — crucial for physical and mental health and cognitive function — ranks third, trailing ‘working long hours’ and ‘sorting out tax returns’.
A third of those polled felt that it wasn’t possible to do all of their admin during working hours, and said they needed four extra hours a day to get through it all.
This gives a sense of how punishing it can be to run an SME, and suggests that the time needed to read the terms and conditions behind the tools these businesses rely on is not easy to come by.
In this context, the 40-minute read times of the T&Cs for transcription tools like Otter, Trint, and Descript are highly consequential.
And that’s assuming it’s possible to understand the most hard-to-read terms and conditions in the first place. This is why we sought out expert legal counsel.
We asked a legal expert in AI and tech to read them and explain key points you need to know
Josilda Ndreaj (LLM) is a legal professional and licensed attorney with expertise in Intellectual Property (IP) law.
Her career as a legal consultant began in a prestigious international law firm, catering to Fortune 500 clients. Here, Josilda navigated complex legal matters and provided counsel to different corporations.
Driven by her interests in innovation, creativity, and emerging technologies, Josilda then ventured into independent consultancy, focusing on intellectual property law at its intersection with blockchain technology and artificial intelligence.
Josilda holds two Master of Laws degrees: one specializing in civil and commercial law from Tirana University, and the other focusing on intellectual property law from Zhongnan University of Economics and Law.
As such, Josilda was uniquely positioned to review a selection of these AI tools’ legal documents, pulling out key points for the benefit of those of us who don’t hold two Master of Laws degrees.
Her summaries are outlined below:
Gemini

Plagiarism and copyright infringement
Gemini (formerly Bard) has no obligation to declare its training sources, so we can’t check whether it was trained on copyrighted materials. Gemini isn’t excluded from liability for such infringement; if a copyright owner brings a claim, Gemini bears some responsibility. But it’s important to note that Gemini is also trained on what the user gives it, and for this, Gemini requires a license from the user. If the user agrees to grant that license to Gemini but doesn’t actually own the copyright, the responsibility shifts to the user.
Users retain ownership of their inputs and prompts, but output ownership is more complex, and Gemini doesn’t make this clear in its Terms. Many jurisdictions don’t recognize intellectual property rights for machine-generated works. It is, however, questionable to argue the output is “human-generated”, even if the user owns the input.
Accuracy and reliability
Always use discretion when relying on Gemini for accurate information. Its Terms of Service state that accuracy should not be automatically assumed by the user. Therefore, you should fact-check all content generated from Gemini.
Business owners should never publish an output from Gemini without cross-referencing, reviewing for updates, and checking with experts for accuracy. Otherwise, they run the risk of publishing misleading information, which may carry reputational or legal consequences.
Security and confidentiality
Google (the owner of Gemini) provides no information in its Privacy Policy on how it handles confidential data.
Usage
Google states nothing in its Terms of Service on whether content generated by Gemini can be used for commercial purposes. It explains restrictions for things like intellectual property rights but nothing specifically for AI-generated content.
ChatGPT

Plagiarism and copyright infringement
Currently, no legislation requires ChatGPT to publicly declare what its model is trained on. So, because it doesn’t reveal its sources, we can’t know if ChatGPT delivers or processes content that is protected by copyright laws. If someone identifies copyrighted content from ChatGPT, they can make a claim to remove that content.
ChatGPT’s Terms state that user input, even if copyrighted, is used to improve the service and train the model; and the user can’t opt out of this. ChatGPT’s Terms say that a user owns all input and output content (if the input content is already used legally), but that doesn’t guarantee that the user owns the copyright to that input and output content. In addition, users can’t pass off anything from ChatGPT as ‘human-generated’.
Accuracy and reliability
Users should verify all information from ChatGPT. That’s because ChatGPT bears no responsibility to provide accurate, up-to-date content. According to its Disclaimer of Warranty section, the user takes on all risks of accuracy, quality, reliability, security, and completeness. Therefore, always verify facts from ChatGPT; cross-reference, review for updates, and check with experts for accuracy, too. Business owners may face legal or reputational consequences if they don’t verify ChatGPT content for accuracy before publication.
Security and confidentiality
ChatGPT collects information from inputs, including personal information, to potentially train its models (according to the Privacy Policy). The user can opt out. The situation changes if data gets submitted through API connections (ChatGPT Enterprise, Team, etc.); that’s because ChatGPT doesn’t use inputs from business customers to train its models. ChatGPT has security measures in place, but doesn’t explicitly address responsibility for a security breach. It all depends on regional laws.
Usage
ChatGPT users own their input and output content; the users, therefore, must ensure the content doesn’t violate any laws. Users can’t claim the content is human-generated, but you don’t have to say it’s AI-generated either. As long as users follow regional laws and the Terms of Use, ChatGPT content can be used for commercial purposes, on social media, for paid advertising, and other channels. It’s advised you fact-check, make references, and abide by laws before publishing content from ChatGPT.
DeepSeek

Plagiarism and copyright infringement
Neither the Privacy Policy nor the Terms specify whether DeepSeek’s AI tool has been trained on copyrighted materials. What’s more, they also provide no guarantees that the outputs will not infringe on anyone’s copyright. DeepSeek’s Terms of Use state that users retain rights to their inputs (prompts), but this doesn’t necessarily imply that they’re copyright protected in the first place, so users should take steps to ensure that what they’re using as prompts isn’t someone else’s intellectual property.
The Terms also explicitly state that the user is responsible for all inputs and corresponding outputs, so with that in mind, users should independently verify these by seeking legal advice and/or using copyright databases. For example, the US Copyright Office Public Catalog allows users to search for registered works in the US, while Google’s Reverse Image Search helps check whether an image has been previously published or copyrighted.
Accuracy and reliability
DeepSeek’s Terms of Use say that outputs are AI-generated, and therefore may contain errors and/or omissions. It specifically refers to medical, legal, and financial issues that users may ask DeepSeek about, and how DeepSeek’s outputs don’t constitute advice. But more broadly, it says the same about ‘professional issues’, which could be just about anything — from marketing and research to technology and education and beyond — and that outputs don’t ‘represent the opinions of any professional field.’ Users should therefore independently verify all of DeepSeek’s outputs for accuracy.
Security and confidentiality
DeepSeek’s Privacy Policy explains that user inputs are processed to generate outputs, but also to improve DeepSeek’s service. This includes ‘training and improving [their] technology’. Users should therefore be cautious about inputting sensitive information, and even though DeepSeek has ‘commercially reasonable’ measures in place to protect the data and information used as inputs, it doesn’t provide any absolute guarantees. DeepSeek’s terms state that they don’t publish input or outputs in public forums, but some may be shared with third parties.
Usage
Any content that users generate through DeepSeek can be used for commercial purposes, but because of gray areas around plagiarism and accuracy, users should take steps to verify the content before using it in this way. DeepSeek’s Terms of Service don’t reference any limitation regarding where in the world users can publish this content, but they clearly state that users must declare it as AI-generated ‘to alert the public to the synthetic nature of the content’.
DALL-E

Plagiarism and copyright infringement
Similar to ChatGPT, DALL-E does not declare sources for its model training. If, however, you find copyrighted content, you can submit a claim for removal. It’s difficult to check if DALL-E infringes upon a copyright since no legislation requires DALL-E to reveal its data sources. User input, according to the Terms, can be used to train DALL-E’s model — even if it’s copyrighted content. The user may opt out of this.
DALL-E users own input and output content (not DALL-E), but that doesn’t mean the user has a copyright on said content. Also, users can’t claim DALL-E content is human-generated.
Accuracy and reliability
DALL-E bears no responsibility for inaccurate content. Its Terms pass all liability onto the user. These liabilities include content reliability, accuracy, quality, security, and completeness. So, before publishing, users should always cross-reference, review for updates, verify facts, and check with experts, since publishing false information may lead to reputational or legal repercussions.
Security and confidentiality
The Privacy Policy and Terms and Conditions from DALL-E never explicitly address responsibility in the event of a security breach. DALL-E does have security measures in place, though. Who bears the responsibility in the event of a hack depends on regional laws.
DALL-E collects a user’s input for model training, even personal information, but you can opt out of this. With the API, however, DALL-E handles input data differently; it doesn’t use data from business users to train its models.
Usage
The user owns DALL-E input and output content; they must also ensure that content doesn’t violate any laws or the DALL-E Terms. You can’t claim content from DALL-E is human-generated, but nor do you have to say it’s AI-generated.
You can use DALL-E for commercial purposes, as long as you follow all laws and the DALL-E Terms. Regulations may change, but at the time of writing, users are welcome to publish DALL-E content on social media, in advertisements, and on other channels. Users should always make proper references and fact-check accuracy to avoid violating any laws.
Bing AI

Plagiarism and copyright infringement
Bing has no obligation to share its training data sources, making it very difficult to figure out whether Bing AI inadvertently uses copyrighted content. Although such infringement is tricky to identify, users can still make claims on copyrighted content. The Microsoft Services Agreement says Bing AI takes user inputs and outputs to improve its model, but there’s nothing formal in place to prevent intellectual property theft.
The Services Agreement also says that users grant Microsoft a worldwide, royalty-free intellectual property license to use their content. Furthermore, Bing/Microsoft takes user inputs to train its AI models.
Accuracy and reliability
Microsoft offers no guarantees for the accuracy or timeliness of content from its AI services. Therefore, you should fact-check everything from Microsoft’s AI tools, especially if you’re a business owner who wants to avoid any reputational and legal consequences.
Security and confidentiality
Microsoft’s AI tools (including Bing AI) use personal and confidential user data to train their models. Microsoft’s Services Agreement does not cover AI-generated content; instead, it tries to shift all responsibility for AI content onto the user. Microsoft also assumes no responsibility for its customers’ privacy and security practices. In short, if your data gets breached while using Bing AI, it’s your problem, not Microsoft’s.
Usage
Microsoft does not claim ownership of user content, but it doesn’t specifically regulate AI-generated content, where ownership is uncertain. The Services Agreement lets people use content for commercial purposes, with some significant stipulations: you must accept that AI-generated content lacks human creativity, so it can’t be claimed as intellectual property; you must also not infringe upon the intellectual property rights of others. In short, you can’t use intellectual property from others, but whatever you make with Bing is probably not your own intellectual property.
Quillbot

Plagiarism and copyright infringement
Quillbot has no obligation to reveal sources it uses to train models. However, the company interestingly tries to regulate one unique situation: what if the source of model training is the AI’s output? Quillbot essentially attempts to minimize the potential for copyrighted output, but states there’s still a chance output is copyrighted if users input copyrighted content. To make things more confusing, Quillbot tries to cover all areas by saying users grant Quillbot an unlimited, sub-licensable license while also claiming users own all of their outputs.
Accuracy and reliability
Quillbot bears no responsibility for accuracy or reliability. Users should be very cautious and verify all facts from Quillbot.
Security and confidentiality
Quillbot has measures to protect user privacy, but it may still end up processing personal data. There are special protections for children’s privacy. Responsibility for data loss from a hack is handled on a case-by-case basis. Quillbot states the user should take steps to prevent their personal data from being hacked, and that Quillbot has data protection elements in place.
Usage
Quillbot users can publish generated content for commercial purposes, but you may need to follow some rules, like not publishing harmful or misleading content. Quillbot’s Terms don’t say that you need to declare its content is AI-generated. In short, the user can publish content generated by Quillbot as long as it doesn’t violate any laws or rights.
Pixlr

Plagiarism and copyright infringement
Pixlr doesn’t reveal its sources for AI model training, since there’s no legal obligation for them to do so. Its Terms state the user owns the content, but users also grant a license to Pixlr to use the content. This is an attempt to minimize the usage of copyrighted content.
Accuracy and reliability
Pixlr bears no responsibility for inaccuracies in its AI-generated content. Users should be careful to fact-check everything from Pixlr.
Security and confidentiality
Pixlr takes user inputs for AI model training. It passes the burden to the user to be careful about inputting personal or confidential information. Pixlr waives its responsibility to filter some information from its training data, though it does use some filters to block personal or confidential information. Pixlr claims no liability for security issues caused by users’ actions.
Usage
Users can publish AI-generated content made through Pixlr for commercial purposes (though some conditions apply). The Terms don’t require you to state anything is AI-generated. Users are still liable for violating rights or laws, though.
Midjourney

Plagiarism and copyright infringement
Midjourney users grant the company rights to all assets made through it, but the user also owns all the assets. There are exceptions, like how you can’t claim ownership if you just upscaled an image from an original creator.
Accuracy and reliability
Midjourney bears no responsibility for the reliability and accuracy of its content. Users, therefore, should take extra caution to verify facts and check reliability.
Security and confidentiality
Midjourney trains its model with user inputs, even if it includes personal or confidential data. Midjourney claims the user should be careful with sensitive data, so it’s not their issue. The company attempts to filter out certain information for model training, but it’s not required. Midjourney claims no responsibility for security issues that may occur from a user’s actions.
Usage
Midjourney users can publish generated content for commercial purposes. Some conditions apply, like the requirement for companies earning more than $1M per year to subscribe to the Pro version. At the time of writing, users don’t have to declare that anything from Midjourney is AI-generated, even though legislation is in motion to change this. Users can generally use any Midjourney content as long as it doesn’t violate any rights or laws.
Clipchamp

Plagiarism and copyright infringement
Clipchamp is a Microsoft product, so it claims no obligation to share data sources for model training. Therefore, it’s hard to tell if the AI tool inadvertently uses copyrighted data. Someone must identify the copyright infringement to make a claim, which is unlikely. The Microsoft Services Agreement says it can use user inputs and outputs to improve services, and Microsoft doesn’t regulate intellectual property matters. Also, users grant a worldwide, royalty-free intellectual property license to Microsoft.
Accuracy and reliability
Clipchamp and Microsoft guarantee nothing in terms of the accuracy or timeliness of their AI tools, so users should verify all facts before publication. This is particularly true for business owners, who may face legal and reputational consequences.
Security and confidentiality
Clipchamp and Microsoft use personal and confidential data for model training. Microsoft tries to shift responsibility for all content issues to the user. The same goes for hacking and security breaches; it’s the user’s responsibility if anything happens.
Usage
Clipchamp and Microsoft steer away from regulating AI-generated content, never claiming that Microsoft owns the content. Technically, Microsoft says the user owns it but without the intellectual property rights. The user can publish Clipchamp content for commercial purposes with two stipulations: you can’t infringe on intellectual property rights, or claim you have intellectual property rights for generated content.
Looka

Plagiarism and copyright infringement
The Looka Terms state that the company has no obligation to share its data training sources, so it doesn’t. Users bear all risks when using Looka-generated content.
Accuracy and reliability
Looka accepts no responsibility for the accuracy and reliability of the output from its AI tools. Users should verify all facts and check for reliability.
Security and confidentiality
Looka uses input data — even potentially personal and confidential data — to train its model, so the user is responsible for not using this type of data in their input. Looka tries to filter such data, but they claim they don’t have to. In addition, all security issues are the user’s burden.
Usage
Looka users may use AI-generated content for commercial purposes, but they may need to follow conditions or pay a fee. Users don’t have to label their generated content as AI-generated. Users should avoid publishing generated content that violates rights or laws.
Speechify

Plagiarism and copyright infringement
It’s not possible to tell if Speechify trains its model on copyrighted materials. We simply don’t know. Speechify’s Terms recommend not using copyrighted material for inputs, which suggests that some outputs may have copyrighted data. Speechify claims to bear no responsibility for this.
Accuracy and reliability
Speechify, according to its Terms, takes no responsibility for the accuracy of its outputs. Users should always check for timeliness, reliability, and accuracy with Speechify.
Security and confidentiality
Speechify passes all responsibility to the user when it comes to inputting sensitive or personal data. The tool does use inputs to train models, so users should be careful. Speechify doesn’t necessarily have to filter sensitive or personal data, but it tries to with some filters. All responsibilities for security issues get passed to the user.
Usage
Speechify users may use outputs for commercial purposes, with some conditions like having to subscribe or get licensing. Users don’t have to declare that Speechify content is AI-generated. Speechify users, however, should never violate rights or laws when publishing Speechify outputs.
Kapwing

Plagiarism and copyright infringement
The Kapwing Terms place all intellectual property and accuracy responsibilities on the user. All users grant Kapwing a non-exclusive license, but the users retain ownership of the content. Kapwing discloses nothing about whether it uses input data for model training.
Accuracy and reliability
Kapwing puts all accuracy responsibilities on the user, especially any legal and reputational consequences.
Security and confidentiality
Kapwing users take on all risks when choosing to input confidential information into its AI tool. Kapwing also offers no warranty and accepts no responsibility for the security of the service.
Usage
You can publish Kapwing content commercially, but Kapwing advises users to be cautious. Its terms don’t say whether users must declare that output from Kapwing is AI-generated.
Disclaimer
This information is for general information purposes only and should not be taken as legal advice. Ecommerce Platforms assumes no responsibility for errors or omissions. Consult a suitable legal professional for advice and guidance tailored to your specific needs and circumstances.
Conclusion
The ubiquity of AI makes it increasingly likely for us all to use tools and apps based around this technology — yet many of us don’t have the luxury of the time needed to read their terms and conditions.
In 2017, professional services network Deloitte found that 91% of consumers “willingly accept legal terms and conditions without reading them”, a figure that rises to 97% among those aged 18 to 34.
Given how many AI T&Cs received low readability scores in our analysis, it seems the impenetrability of these documents’ legalese deters users from even attempting to understand them.
We worked with a legal professional to parse the documents for us, but it’s questionable whether this should be necessary.
We hope that this research — including its readability ratings, and Josilda Ndreaj’s expertise on the terms and conditions to be mindful of — will help guide your choices of which AI apps and tools to engage with.
Methodology and Sources
How we conducted the research
Starting with a seed list of around 90 AI tools and apps, we first gathered each tool’s legal documentation, from terms and conditions to privacy policies. We then recorded the word counts of these documents and calculated their readability scores using the Flesch reading-ease formula. Next, we enlisted the help of a legal expert, Josilda Ndreaj (LLM), who reviewed a selection of these legal documents and identified key points that users should be aware of.
For around 30 of the AI tools that have mobile app versions available, we searched each on the Apple App Store and recorded their privacy labels shown on their listings. These are divided into 14 categories of data that can be collected from users, and for what purpose. To calculate which AI apps collected the most data, we measured how many of the 14 possible categories these apps tracked their users across.
It’s important to note that these 14 categories are divided further into individual data points. For example, ‘Contact Info’ includes five data points: ‘Name’, ‘Email Address’, ‘Phone Number’, ‘Physical Address’, and ‘Other User Contact Info’. To find out which apps collect the most individual data points, take a look at the last column in each of the tables.
Some apps will collect more individual data points than those appearing higher in the ranking. This is because our ranking methodology considers which apps collect data across the most categories overall, suggesting a broader and therefore more ‘complete’ picture of user data, rather than the depth of information they collect in each category.
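As an illustration of this breadth-first ranking, here’s a minimal Python sketch using hypothetical privacy-label data. The category names assigned to Canva below are an assumption for illustration (the counts mirror its row in the third-party advertising table above); the real dataset covers all 14 categories and every app we assessed.

```python
# Hypothetical privacy-label data: category -> number of data points
# collected in that category. The category assignments are illustrative.
app_labels = {
    "Canva": {"Contact Info": 2, "Identifiers": 2, "Location": 1,
              "Search History": 1, "Usage Data": 2},
    "ChatGPT": {"Identifiers": 1},
}

TOTAL_CATEGORIES = 14  # Apple App Store privacy labels define 14 categories

def rank_apps(labels: dict) -> list:
    # Breadth first: apps are ranked by how many of the 14 categories they
    # collect data in, not by how many individual data points they gather.
    rows = []
    for app, categories in labels.items():
        pct = round(100 * len(categories) / TOTAL_CATEGORIES)
        points = sum(categories.values())
        rows.append((app, pct, points))
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)

for app, pct, points in rank_apps(app_labels):
    print(f"{app}: {pct}% of categories, {points} data points")
# -> Canva: 36% of categories, 8 data points
# -> ChatGPT: 7% of categories, 1 data points
```

Because the sort key considers category coverage before data-point totals, an app that tracks many categories shallowly ranks above one that collects dozens of data points in just a few, which is why Facetune’s 64 data points sit below apps with broader coverage in the table above.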
Sources
Apple App Store pages for each app, accurate as of February 2025.
Various documentation for each AI app (including terms and conditions and privacy policies) accessed and reviewed by Josilda Ndreaj in February 2024, except DeepSeek, which was accessed and reviewed in January 2025.
Flesch-Kincaid Readability calculator.
Word count scraper.
Various roundups to inform the initial seed list of AI apps and tools.
Correction requests
We periodically update this research.
If you are the owner of any of the AI tools included in this research and you’d like to challenge the information on this page, we’re willing to update it subject to our review of the evidence you provide. When contacting us in this regard, we kindly ask for:
business documents verifying your legitimacy (for example, incorporation certificate or registration documents)
the information on this page you believe to be outdated (please be specific)
how it should be updated and why, with links to documentation that backs this up (for example, amendments to Terms of Service)
Please contact us at [email protected] with the subject line: ‘Correction request: AI tools study’, plus the name of the AI tool you’re contacting us about.