Talk:ORES

About this board

This talk page is intended to be used exclusively to discuss the implementation of ORES. For any discussion about the bots/tools that use ORES, please use their respective talk pages.

Townie (talkcontribs)

There was recently a suggestion in the Catalan Wikipedia to start using ORES again after months of inactivity. We have an edit type campaign, but there haven't been any damaging-goodfaith campaigns, which seems unusual according to ORES/Get support. Furthermore, we do not seem to have language support for Catalan. Are we on the right track? What should we do to reactivate the labelling project? Thanks in advance.

Ladsgroup (talkcontribs)

As far as I can see the labeling campaign is 8% done: http://labels.wmflabs.org/stats/cawiki/. In order to label more go to http://labels.wmflabs.org/ui/cawiki/

Townie (talkcontribs)

@Ladsgroup: So should we continue and end this campaign, or ask for language support first?

Ladsgroup (talkcontribs)

Both would be great, and both are needed for the advanced support. We already have the generated list for Catalan: m:Research:Revision scoring as a service/Word lists/ca. A review of that list would be awesome (see ORES/BWDS review for more info).

EpochFail (talkcontribs)

@Townie: just checked up on this and it seems like we're blocked on you or some other Catalan speaker reviewing the word lists Ladsgroup linked to above.

Townie (talkcontribs)

@EpochFail: Thanks a lot! I will try to fill the lists as soon as possible.

Townie (talkcontribs)

@EpochFail: I think I'm done with the bad words list and the informal one. Many of the words in the generated list were neither informal nor bad words, simply words which vandals use without being any type of slur. I left them there, feel free to take them out if necessary.

EpochFail (talkcontribs)

Townie, that's perfect. Indeed, our "BWDS" script picks up a lot of non-bad words that vandals use. Your help in filtering them out is greatly appreciated. :)

Reply to "Catalan Wikipedia"
Xinbenlv (talkcontribs)

Does ORES have a recommended way to set a request header so that we can identify ourselves when calling this API (and get a higher QPS limit)? Thank you!

EpochFail (talkcontribs)

Yes. Please include an email address and some description of your project in the User-agent string. See API:Main_page#Identifying_your_client for some tips on what to include in a good User-agent.
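For illustration, here is a minimal sketch of attaching such a User-agent to a request, using only the Python standard library. The project name, version, email address, wiki, model, and revision ID are placeholders, and the `/<wiki>/?models=...&revids=...` URL layout is an assumption that should be checked against the ORES documentation. The sketch only builds the request; it does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base endpoint as mentioned elsewhere on this talk page.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_user_agent(project, version, contact_email):
    """Return a descriptive User-agent: project name, version, contact."""
    return f"{project}/{version} ({contact_email})"


def build_scores_request(wiki, model, rev_id, user_agent):
    """Build (but do not send) a scoring request carrying the User-agent."""
    # Assumed URL layout: /v3/scores/<wiki>/?models=<model>&revids=<rev_id>
    query = urlencode({"models": model, "revids": rev_id})
    url = f"{ORES_BASE}/{wiki}/?{query}"
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder values, for illustration only.
ua = build_user_agent("ExampleDraftChecker", "0.1", "someone@example.org")
req = build_scores_request("enwiki", "damaging", 123456, ua)
print(req.full_url)
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would send it; keeping the header construction in one function makes it easy to reuse the same identification for every call.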

Reply to "Recommended to set request header"

"El Real Colegio Convictorio de Nuestra Señora de Monserrat"

Summary by EpochFail

Not the right place to raise this concern. Please use the talk page.

Paraquaria (talkcontribs)

I made a change to "El Real Colegio Convictorio de Nuestra Señora de Monserrat" concerning the Viceroyalty of the Río de la Plata. The reference to that territory is wrong because the Viceroyalty of the Río de la Plata dates from 1776 and the college was founded in 1687, when the territories belonged to the Viceroyalty of Peru.

EpochFail (talkcontribs)

I'm sorry but this isn't the right place to raise your concern. Please consider editing the talk page to comment on content of a specific article.

Summary by EpochFail

Not the right place to raise this concern. Please use the talk page.

Arjd1977 (talkcontribs)

I have been trying to update the page and add some names of the sons and daughters of Maria de Garay (daughter of Juan de Garay "el mozo") that were not in the text before, but the changes have been deleted.

EpochFail (talkcontribs)

Hi Arjd1977, I'm sorry but this isn't the right place to raise your concern. Please consider editing the talk page to comment on content of a specific article.

Summary by EpochFail

Not the right place to ask

Jsanchezsantonja (talkcontribs)

I have been trying to update the page of Setter Motocicletas in Spain, and my edits have been rejected... I can update the page and include all the new information about the Setter brand, which is owned by my family.

What can we do? thanks

Jose

EpochFail (talkcontribs)

I'm sorry to say that this is not the right place to ask.

Difference between EndPoints in the Sample Queries

Xinbenlv (talkcontribs)

Hi ORES team, what's the difference between the endpoints http://ores.wmflabs.org/v3/scores and https://ores.wikimedia.org/v3/scores in the sample query section?

One is under wmflabs.org, the other is under wikimedia.org

EpochFail (talkcontribs)

Luckily, we're just finishing up an FAQ for ORES. See ORES/FAQ#What deployments of ORES are there? for what you are looking for. :)

Reply to "Difference between EndPoints in the Sample Queries"
181.56.91.4 (talkcontribs)

I made an edit inserting the link to Nikki's Wikipedia page. Nikki was part of the program, and the link was neither wrong nor made with bad intent.

EpochFail (talkcontribs)

I'm not sure what the question is.

40.132.158.86 (talkcontribs)

I think the information on Ednita Nazario (Puerto Rican singer) should be edited. First, the first section should be called "Principios de Carreras" rather than "comienzo de su carrera", because it looks more formal. Second, split the sections into five-year periods, since she has many albums and each one is not getting enough attention. Third, change the headers to include the years (for example, "1970-1975" followed by the album names). Fourth, edit the last header to add her new album "Una Vida" and her autobiography. Fifth, edit the Desnuda section, because the link points to something other than her album. I try to edit it, and every time I do, the bot alters it. Thanks.

Adamw (talkcontribs)

I see this was also posted to the correct page, https://es.wikipedia.org/wiki/Discusión:Ednita_Nazario

Thank you for your interest!

Is there a correlation between the editing environment and draftquality?

Whatamidoing (WMF) (talkcontribs)

I'd like to know whether draftquality is the same for new accounts that create new articles in the visual editor as it is for new accounts that create new articles in the older wikitext editors (at the English Wikipedia).

@Nettrom, I'm assuming that this is outside the scope of your current projects. @Neil P. Quinn-WMF, is this something that you could do? I'm not sure how much work this would be, but I assume that it's not very difficult.

Alsee (talkcontribs)

I'm very interested in all VE research results. Please ping me if this project goes forwards.

Confounding variables of using different self-selected populations would produce unreliable results, especially because some percentage of those "new accounts" actually represent experienced editors. (Paid editors in particular abuse throw-away accounts for each new article.) However I have a fix. You can do a retroactive controlled study. You have to ignore whether an article was created using VE, and look for any difference in draftquality between the experimental and control groups of the May 2015 study of VisualEditor's effect on newly registered editors.

Comparing control group wikitext articles against experimental group wikitext+VE articles will cut your signal strength in half, but it's the only way to avoid junk data due to skewed population selection.
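To illustrate the "in half" figure above: a minimal sketch, assuming that roughly half of the experimental group's articles are actually created in VisualEditor and that any VE effect on draftquality is a simple additive shift. Both assumptions, and the numbers used, are hypothetical and not taken from the May 2015 study.

```python
def observed_difference(true_effect, ve_share):
    """Expected gap between group means when only `ve_share` of the
    experimental group's articles are actually created in VisualEditor."""
    control_mean = 0.0                          # baseline: wikitext only
    experimental_mean = ve_share * true_effect  # mix of wikitext and VE articles
    return experimental_mean - control_mean


# Hypothetical numbers: a 0.10 shift in a quality score, 50% VE uptake.
print(observed_difference(true_effect=0.10, ve_share=0.5))
```

Under these assumptions, the group-level difference is the per-article effect scaled by the VE share, so a 50% share halves the measurable signal.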

Whatamidoing (WMF) (talkcontribs)

Your idea might measure the added value for VisualEditor's contribution (e.g., if you were trying to show that a new editor using VisualEditor is more likely to properly format a citation), but that's not actually my goal. I'm thinking that the chosen editing environment might be a useful marker.

Alsee (talkcontribs)

There's obviously no value in collecting data showing that experienced editors produce higher quality drafts than new users.

That is likely what we would get if we ran your proposed data-collection without modification. An experienced user with a new-account is more likely to know how to switch to the secondary editor. This can introduce an experienced-user bias in the study's population-selection.

(Collecting reliable data from the wild isn't easy.)

Whatamidoing (WMF) (talkcontribs)

My proposal is to study an objective, non-speculative condition: "new accounts that create new articles".

Alsee (talkcontribs)

What value would that have, if it merely establishes that experienced users produce higher quality drafts than new users?

We can get much more valuable results by re-examining data from the controlled study.

Whatamidoing (WMF) (talkcontribs)

You are welcome to re-examine that old data if you want to.

You are also welcome to study whether new accounts that you believe to be experienced editors actually produce higher quality drafts. However, it sounds circular to me: How will you divide the brand-new accounts into "experienced" and "new" editors? By looking at the quality of the draft. What are you going to study? Whether the ones that you labeled "experienced", on the basis of their higher quality drafts, produced higher quality drafts than the ones that you labeled "new", on the basis of their lower quality drafts. If you did not find a perfect correlation in such a study, then you would probably want to look for an arithmetic error.

I do not want to discourage you from researching whatever interests you, but your question does not interest me.

Alsee (talkcontribs)

What??? Do you understand why you're going to get junk data?

For new accounts, you can't distinguish experienced editors from new editors. It's a confounding factor. You're proposing to use biased populations.

I also don't understand why you seem actively-averse to looking at high quality data.

Whatamidoing (WMF) (talkcontribs)

Again, I'm not trying to distinguish experienced editors from new editors.

I'm trying to find out whether new accounts (an objective, unbiased, machine-identifiable state that is only partially correlated with the actual experience level of the humans who are using those accounts) that use the newer editing software produce the same or different results as new accounts that don't, on the specific measure of ORES draftquality.

As a side note, it sounds like you're assuming that experienced editors are more likely to switch to visual editing than new editors. I don't think that there is any data to support your assumption.

Alsee (talkcontribs)

Hypothesis 1: Experienced users are more likely to know how to switch to VisualEditor. Draftquality for new-accounts using VisualEditor will skew high, because you're measuring more experienced editors in VE vs newbies in wikitext.

Hypothesis 2: Experienced editors overwhelmingly prefer wikitext. Draftquality for new-accounts using VisualEditor will skew towards 'suck', because you're measuring more experienced editors in wikitext vs newbies in VE.

I find it hard to imagine any valid use for the results when you don't know what you're measuring. I can however imagine some invalid uses for a collection of random numbers.

Edit: Perhaps it would aid my understanding if you identified how you want to use the data, rather than defining the data to be collected. Knowing the intent would help me understand whether I'm mistaken.

Reply to "Is there a correlation between the editing environment and draftquality?"
EEggleston (WMF) (talkcontribs)

Is the github.com/wiki-ai page the right place to link?

EpochFail (talkcontribs)

It's not a bad place to link. We keep all of our primary repos within that organization.

Reply to "Links to repos?"