Content translation/Machine Translation/MT Clients

Machine translation services are accessed using client modules in Content translation. Apertium and Yandex clients are already written in the code. It is possible to add any number of such MT service clients and map them to language pairs. This documentation explains the MT client architecture.

Technical requirements
A new MT client can be a locally hosted machine translation system or a remote machine translation system accessed through an API. API-based services are recommended, since that allows the engine to be isolated as a service. If the engine is freely licensed and well packaged for Linux distributions, we can consider hosting it in the Wikimedia cluster. For example, Apertium is hosted inside wmflabs. On the other hand, Yandex is not hosted by Wikimedia. Both Apertium and Yandex are accessed through their web APIs.

Translation API
A machine translation API takes a source language, a target language and source content, and outputs the translated content. The API must be publicly documented, including its error codes.
 * If the API is not public, it can accept an authentication token, usually a key.
 * The output format can be JSON for convenience.
 * The API should accept POST requests.
 * The API should not demand any user-identifiable information such as a user name. CXServer does not provide it to the MT client.
 * The API should be capable of accepting a reasonable number of requests per minute.
 * The API should accept a reasonable amount of content per request.
 * It is recommended to have a dashboard for analysing API usage, including requests per day/week/month and number of characters translated per day/week/month.
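
As a rough illustration of the requirements above, the following sketch builds a POST request for an imaginary MT service and reads a JSON response. The endpoint URL, the field names (from, to, text, translation, errorCode) and the bearer-token scheme are all assumptions made for this example; they are not taken from Apertium, Yandex or cxserver.

```javascript
// Hypothetical endpoint and field names: not taken from any real MT service.
const BABELFISH_URL = 'https://babelfish.example/api/translate';

/**
 * Build the options for a POST request to the imaginary MT API.
 * Only the languages, the content and an optional API key are sent;
 * nothing that identifies the user is included.
 */
function buildTranslationRequest( sourceLang, targetLang, content, apiKey ) {
    const headers = { 'Content-Type': 'application/json' };
    if ( apiKey ) {
        // Assumed authentication scheme; a real service documents its own.
        headers.Authorization = 'Bearer ' + apiKey;
    }
    return {
        url: BABELFISH_URL,
        method: 'POST',
        headers: headers,
        body: JSON.stringify( { from: sourceLang, to: targetLang, text: content } )
    };
}

/**
 * Read the translation from a JSON response body, or throw if the service
 * reported an error code. The response shape is an assumption of this sketch.
 */
function parseTranslationResponse( responseBody ) {
    const data = JSON.parse( responseBody );
    if ( data.errorCode ) {
        throw new Error( 'MT service error ' + data.errorCode );
    }
    return data.translation;
}

const request = buildTranslationRequest( 'en', 'es', 'Hello', 'secret-key' );
console.log( request.method, request.body );
```

Keeping request construction separate from the network call makes it easy to check that no user data leaks into the payload.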

Performance guidelines
Content translation is still a beta feature, available only to logged-in users who opt in, so the current usage pattern may not be a reliable estimate of future usage. Moreover, when we expand machine translation to more languages, there will be more users and requests. Some baselines based on our _current_ usage are given below. Note that this is not a final assessment; APIs must be designed to accept more than this.
 * At least 10,000 requests per day
 * At least 10 million characters per day
 * At least 5000 characters per request
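
To put these baselines in perspective, the short sketch below derives the average load they imply and checks the documented limits of a candidate service against them. The function and field names, and the example limits passed in at the end, are hypothetical.

```javascript
// Baselines copied from the list above.
const BASELINE = {
    requestsPerDay: 10000,
    charactersPerDay: 10000000,
    charactersPerRequest: 5000
};

/**
 * Return the names of the baselines that a service's documented limits miss.
 * `limits` uses the same field names as BASELINE; a missing field counts as unmet.
 */
function unmetBaselines( limits ) {
    return Object.keys( BASELINE ).filter(
        ( name ) => !( limits[ name ] >= BASELINE[ name ] )
    );
}

// Average load implied by the baselines.
const requestsPerMinute = BASELINE.requestsPerDay / ( 24 * 60 );
const averageCharactersPerRequest = BASELINE.charactersPerDay / BASELINE.requestsPerDay;

console.log( requestsPerMinute.toFixed( 1 ) ); // 6.9
console.log( averageCharactersPerRequest ); // 1000

// Example limits for an imaginary service that allows too few characters per day.
console.log( unmetBaselines( {
    requestsPerDay: 50000,
    charactersPerDay: 2000000,
    charactersPerRequest: 10000
} ) ); // [ 'charactersPerDay' ]
```

These are daily averages; real traffic is bursty, so a service should tolerate short peaks well above roughly seven requests per minute.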

Input format
The content that CX sends for translation is HTML formatted. Translating HTML while preserving markup is challenging, but some MT engines are capable of that (for example, Yandex). Apertium does not handle HTML markup. Depending on the engine's capability, CX can send either a plain text version or the HTML of the content.
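
The choice between the two formats can be pictured with the sketch below. It is only an illustration: the function names are hypothetical, and the tag stripping is deliberately naive (it does not decode entities or cope with malformed markup) and is not how cxserver itself derives plain text.

```javascript
/**
 * Very naive HTML-to-text conversion, for illustration only:
 * replaces every tag with a space, then collapses whitespace.
 */
function toPlainText( html ) {
    return html.replace( /<[^>]*>/g, ' ' ).replace( /\s+/g, ' ' ).trim();
}

/**
 * Decide what to send to an MT engine, depending on whether it can
 * translate HTML while preserving markup.
 */
function prepareContent( html, engineSupportsHtml ) {
    if ( engineSupportsHtml ) {
        return { format: 'html', content: html };
    }
    return { format: 'text', content: toPlainText( html ) };
}

const sourceHtml = '<p>Hello <b>world</b></p>';
console.log( prepareContent( sourceHtml, true ) );
console.log( prepareContent( sourceHtml, false ) );
```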

Quality of translation
We evaluate the quality of MT by requesting feedback from Wikipedia contributors in the language concerned. CX uses MT as an initial translation template and encourages translators to improve it. Because of that, we can use an engine unless the feedback we get shows its quality to be quite bad.

Developing a new MT Client Module
The best way to learn this is to refer to an existing client module such as Yandex or Apertium. The client modules are in cxserver's lib/mt folder. Let us call our client the BabelFish MT client. Create a file named BabelFish.js in the lib/mt folder.

If your BabelFish service cannot translate HTML while retaining all markup in the appropriate position in the translation, you will have to implement the plain-text translation method in your client instead of the HTML one. Refer to Apertium.js for such an example. Yandex.js is an example of an MT client that can handle both HTML and text content.
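
The overall shape of a text-only client might look like the sketch below. Everything in it is an assumption made for illustration: the stand-in base class, the translateText name and signature, the configuration keys and the injected send function are not taken from cxserver, so check Apertium.js and Yandex.js for the real base class and method names before copying anything.

```javascript
// Stand-in for the base class that real cxserver clients extend; it exists
// only so that this sketch runs on its own.
class HypotheticalMTClient {
    constructor( options ) {
        this.conf = options.conf;
        // Function that performs the HTTP request and resolves to the response body.
        this.send = options.send;
    }
}

class BabelFish extends HypotheticalMTClient {
    /**
     * Translate plain text with the imaginary BabelFish web API.
     * @param {string} sourceLang Source language code
     * @param {string} targetLang Target language code
     * @param {string} sourceText Plain text to translate
     * @return {Promise<string>} Translated text
     */
    async translateText( sourceLang, targetLang, sourceText ) {
        if ( sourceText.length > this.conf.maxCharacters ) {
            throw new Error( 'Source text is too long for BabelFish' );
        }
        const responseBody = await this.send( {
            method: 'POST',
            url: this.conf.api,
            body: JSON.stringify( {
                key: this.conf.key,
                from: sourceLang,
                to: targetLang,
                text: sourceText
            } )
        } );
        return JSON.parse( responseBody ).translation;
    }
}

// Fake transport so that the example needs no network access.
function fakeSend( request ) {
    const payload = JSON.parse( request.body );
    return Promise.resolve(
        JSON.stringify( { translation: '[' + payload.to + '] ' + payload.text } )
    );
}

const client = new BabelFish( {
    conf: { api: 'https://babelfish.example/api/translate', key: 'secret-key', maxCharacters: 5000 },
    send: fakeSend
} );

client.translateText( 'en', 'es', 'Hello' ).then( ( translation ) => {
    console.log( translation ); // [es] Hello
} );
```

Injecting the transport function keeps the client testable without calling the remote service. In a real module the class would also be exported.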

You need to add an entry in lib/mt/index.js for your new client.

To map a language pair to this client, create a config file in the config folder. You may refer to the existing configuration files for examples. Then enable this MT engine in the cxserver config.yaml. Here too, follow the existing entries as examples.
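
Conceptually, the mapping answers the question "which clients can translate from language A to language B?". The sketch below shows that lookup with a made-up in-memory structure; the actual cxserver configuration files have their own format, and the language pairs listed here are placeholders rather than real configuration.

```javascript
// Hypothetical data: target languages that each client serves, per source language.
const languagePairs = {
    Apertium: { es: [ 'ca', 'pt' ] },
    BabelFish: { en: [ 'es', 'fr' ] }
};

/**
 * Return the names of the clients configured for a language pair.
 */
function findClients( pairs, sourceLang, targetLang ) {
    return Object.keys( pairs ).filter( ( name ) => {
        const targets = pairs[ name ][ sourceLang ] || [];
        return targets.includes( targetLang );
    } );
}

console.log( findClients( languagePairs, 'en', 'es' ) ); // [ 'BabelFish' ]
console.log( findClients( languagePairs, 'en', 'de' ) ); // []
```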

Restart cxserver and test your client. You may want to read the existing unit tests for Apertium before writing your own.