Talk:Citoid/2019
This page used the Structured Discussions extension to give structured discussions. It has since been converted to wikitext, so the content and history here are only an approximation of what was actually displayed at the time these comments were made.
Previous archives are at /Archive 1
Missing case in itemTypes
Hello, in Citoid/itemTypes the "case" type is completely missing. Several examples are also missing, and some links don't even work. Could someone update the page a little? Dvorapa (talk) 18:11, 7 January 2019 (UTC)
- @Toniher@Mvolz ping Dvorapa (talk) 18:34, 7 January 2019 (UTC)
- I fixed a few things but we don't really maintain the page, another volunteer made it. Which links didn't work? Feel free to remove ones that don't. Mvolz (WMF) (talk) 15:11, 8 January 2019 (UTC)
- I will not remove any, as it is better to find an archive link than to have no clue what, for example, "hearing" stands for.
- The link for thesis doesn't work, but I think I understand this one's meaning. Dvorapa (talk) 16:52, 8 January 2019 (UTC)
- I've found https://www.zotero.org/support/kb/item_types_and_fields, which seems to be a good source too. Dvorapa (talk) 01:51, 31 January 2019 (UTC)
citation bot
How can we get User:Citation bot whitelisted? It does high volume and exceeds policy. They have their own Citoid install, but it is not as good as the Wikipedia install, which gives better results. GreenC (talk) 18:24, 8 January 2019 (UTC)
- To reduce load we do not query URLs for citations that are already "complete". Since historically we are used in the sciences, we do not consider a citation complete without a volume number. We have added a short list of popular websites (CNN.com and such; it is purely my own writing based upon a few wiki pages) that do not have volumes and such, to flag more citations as "complete". If we switch to Citoid, then it would be nice to get some feedback on which websites' URLs are the most common, so we can add more to that list. AManWithNoPlan (talk) 18:46, 8 January 2019 (UTC)
- @Mobrovac-WMF Mvolz (WMF) (talk) 11:15, 9 January 2019 (UTC)
- Hm, there are no limits on the citation API end points AFAIK. Could you provide a sample response? Mobrovac-WMF (talk) 11:55, 9 January 2019 (UTC)
- en.wikipedia.org Citoid blocks us, so we run our own on the tool server. AManWithNoPlan (talk) 14:49, 9 January 2019 (UTC)
- Is there more than one endpoint?
- We tried https://en.wikipedia.org/api/rest_v1/#!/Citation/getCitation AManWithNoPlan (talk) 18:58, 12 January 2019 (UTC)
- No, that's the correct one... are you using the correct query pattern? Requests for restbase installs look like
- https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/http%3A%2F%2Fwww.example.com
- But if you're using a native citoid install they look like
- http://localhost:1970/api?format=mediawiki&search=http%3A%2F%2Fwww.example.com
- The former puts the params in the url, the latter uses query params. Mvolz (WMF) (talk) 10:59, 11 February 2019 (UTC)
- We got blocked so we now use our own on the tool servers. It does seem to time out at times also. AManWithNoPlan (talk) 17:07, 11 February 2019 (UTC)
- So I've asked operations, and we don't block IPs. However, we do have rate limits of 1000 per 10 seconds (100 requests per second), which apply to all the mediawiki and restbase APIs. If you are receiving a 429 response, you just need to make sure you are adding a timeout in between requests so as not to exceed the limit.
- Is 429 the response code you are getting?
- (Also, it does take really long sometimes; depending on how long the timeout is in your request package, a response may exceed it, so you may want to set a longer timeout on your end.) Mvolz (talk) 18:05, 18 February 2019 (UTC)
- I will investigate. If it is 429, then we can sleep a little while and try again. The bot can be run by lots of people at once, so we might hit the limit. How long of a timeout are we talking about? AManWithNoPlan (talk) 18:13, 18 February 2019 (UTC)
- The docs say "1000/10s (100/s long term, with 1000 burst)" - so I think if you exceed 1000 and wait 10s it should reset, but this is just from reading the docs.
- For the timeout I tried to find an outside value for you; in tests we allow a request to take up to 40 seconds, but there have even been cases of something taking 75 seconds to return in the wild: https://phabricator.wikimedia.org/T165105#4666586
- And the caching layer sets a timeout of 360 seconds, so you will not get any responses that take longer than that. Mvolz (WMF) (talk) 12:24, 20 February 2019 (UTC)
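The two query patterns and the 429 advice from this thread can be sketched as follows. This is a minimal illustration, not Citation bot's or Citoid's actual code: the function names, the retry count, the 10-second wait (taken from the "wait 10s" reading of the docs above), the 90-second client timeout (chosen only because it is above the 75 seconds reported above) and the User-Agent string are all assumptions.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def restbase_citation_url(query, host="https://en.wikipedia.org", fmt="mediawiki"):
    """RESTBase pattern: the format and the percent-encoded query go in the path."""
    return f"{host}/api/rest_v1/data/citation/{fmt}/{quote(query, safe='')}"


def native_citoid_url(query, base="http://localhost:1970/api", fmt="mediawiki"):
    """Native citoid pattern: the format and the search go in query parameters."""
    return f"{base}?{urlencode({'format': fmt, 'search': query})}"


def http_fetch(url, timeout=90):
    """Return (status, body). The 90 s timeout is an assumed value."""
    # The User-Agent value is a placeholder; use one that identifies your tool.
    req = urllib.request.Request(url, headers={"User-Agent": "example-citation-client/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; hand the status back so 429 can be retried.
        return err.code, ""


def fetch_with_backoff(fetch, url, max_tries=3, wait_seconds=10.0, sleep=time.sleep):
    """Call fetch(url); on a 429 response, wait and try again (assumed policy)."""
    for attempt in range(max_tries):
        status, body = fetch(url)
        if status != 429:
            return status, body
        if attempt < max_tries - 1:
            sleep(wait_seconds)
    return status, body
```

A caller would use something like `fetch_with_backoff(http_fetch, restbase_citation_url("http://www.example.com"))`. Passing `fetch` and `sleep` in as parameters keeps the retry logic testable without network access.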
We couldn't make a citation for you.
Hello. I have a problem with citoid. After I insert a URL, DOI, or PMID, an error message is shown like this: "We couldn't make a citation for you. You can create one manually using the "Manual" tab above."
I think the Zotero translation server is functioning, because the curl test and 'npm test' results showed no problem. I ran into the problem while following the citoid installation manual: I cannot clone REL1_31 in place of REL1_29. The error message in my shell is "Remote branch REL1_31 not found in upstream origin".
If you want to do an anonymous checkout:
git clone https://gerrit.wikimedia.org/r/p/mediawiki/services/citoid
Like VisualEditor, the master branch requires alpha builds of MediaWiki. If you're installing on another MediaWiki version, use the right branch like git clone -b REL1_29 https://gerrit.wikimedia.org/r/p/mediawiki/services/citoid
I cannot find REL1_31 services/citoid
Thanks in advance.
P.S. I'm not a programmer; I just followed the MediaWiki manual, so my explanation of the error may lack information.
MediaWiki Certified by Bitnami 1.31.1-2 on Ubuntu 16.04 Timmy87 (talk) 00:52, 17 January 2019 (UTC)
- I checked Chrome debugging and found an error message like this:
- "OPTIONS http://localhost:1970/api?action=query&format=mediawiki&search=123412345 net::ERR_CONNECTION_REFUSED"
- ----
- $ curl -X GET --header 'Accept: application/json; charset=utf-8' 'http://localhost:1970/api?action=query&search=forhappywomen.com&format=mediawiki'
- But this command works. What's the problem? Timmy87 (talk) 07:25, 18 January 2019 (UTC)
- It sounds like a configuration issue on the wiki. Have you followed the directions at Citoid#Configure Citoid on a Citoid-enabled wiki, and verified that you are getting template data by following the directions at Citoid#Get "could not make a citation for you" every time? Mvolz (talk) 12:12, 18 January 2019 (UTC)
- @Mvolz
- Thanks for the answer! I have already followed those directions, but it doesn't work.
- So I followed the Citoid page again, but I couldn't find the right branch of services/citoid (not extensions/Citoid).
- I think my problem is that I couldn't clone REL1_31. Could you explain how I can get REL1_31 of services/citoid?
- (Although I changed 'REL1_29' to 'REL1_31' in the following sentence, I see the error message: Cloning into 'citoid'...
- fatal: Remote branch REL1_31 not found in upstream origin)
- ----
- From the Citoid page:
- Like VisualEditor, the master branch requires alpha builds of MediaWiki. If you're installing on another MediaWiki version, use the right branch like
- git clone -b REL1_29 https://gerrit.wikimedia.org/r/p/mediawiki/services/citoid
- ----
- I cannot find REL1_31 of services/citoid. Timmy87 (talk) 15:14, 20 January 2019 (UTC)
- Sorry about that... the documentation is not correct :/ I've fixed it now. You need the REL1_31 version of the extension, not the service, i.e. https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/Citoid,branches
- The master version of the service should work with REL1_31, so I don't think that's the issue.
- What does the output of http://localhost/api.php?action=templatedata&titles=Template:Cite%20web/doc&format=jsonfm look like? Mvolz (talk) 15:50, 20 January 2019 (UTC)
- Mvolz
- Thanks for the answer. I agree with that. I solved the problem by changing the AWS security options. Actually, it was entirely my mistake: I hadn't opened port 1970 in the AWS settings.
- I solved it, and URL lookup through citoid now works well. But other functions (DOI, ISBN, PMID...) have a problem :-( Timmy87 (talk) 05:10, 21 January 2019 (UTC)
- Unfortunately ISBN won't work anymore without a developer key from worldcat, since they shuttered their xisbn service a year or so ago :/. DOI and PMID should be working. Mvolz (talk) 09:33, 21 January 2019 (UTC)
- Ah! I understand.
- Whenever I try to enter the PMID, I see the following message:
- Lua error in Module:Citation/CS1/Identifiers at line 47: attempt to index field 'wikibase' (a nil value).
- I can't find anyone who has struggled with the same problem.
- PMID auto-search citation is what I really want to use, and why I started with MediaWiki. :-( Timmy87 (talk) 23:33, 21 January 2019 (UTC)
- I solved the Lua error with this page: Extension talk:Wikibase Client/Lua. Timmy87 (talk) 07:37, 22 January 2019 (UTC)
- Thanks! Other people have had the same issue so I posted here: w:Help_talk:Citation_Style_1#Errors_from_imported_CS1_templates_into_third_party_wikis_which_don't_have_wikibase
- It should be fixed in future versions of the template module. Mvolz (talk) 10:55, 6 February 2019 (UTC)
- FYI ISBN should be working now with the latest version of citoid, if you would like to update it: https://gerrit.wikimedia.org/r/#/c/mediawiki/services/citoid/+/486851/ Mvolz (WMF) (talk) 12:17, 18 February 2019 (UTC)
Bloomberg sites say "are you a robot?"
I just tried adding an article from https://www.bloomberg.com by citoid, but apparently Bloomberg thinks citoid is a bot and asks for a captcha test. Is there anything Citoid developers can do? Or is it just up to Bloomberg IT staff? Roy17 (talk) 21:36, 9 February 2019 (UTC)
- AFAIK the Citoid query to an external website declares itself to be Citoid, not a browser user agent.
- One should not try to cheat by pretending to be a browser, since that will easily be discovered in the subsequent exchange.
- Yes, you are right, Bloomberg staff should make a silent exception, but that could be exploited by every other grabber then. PerfektesChaos (talk) 08:11, 16 February 2019 (UTC)
- Wouldn't it be great if we could send something like an "Accept: application/ld+json" header meaning that we only want the metadata and not the content? I guess webmasters may have an incentive to do that because it may save them some bandwidth, from crawlers and bots like us. I wonder whether someone has made it a standard already; I haven't checked to be honest.
- Anyways, if most websites already fail to embed metadata appropriately, I imagine most wouldn't implement something like this either! :/ Diegodlh (talk) 18:00, 6 June 2022 (UTC)
Statistics
The preamble mentions statistics for usage of Citoid. Where could I find this? I'm interested in usage on SVWP after we added translators for a few Swedish sites last fall. Sebastian Berlin (WMSE) (talk) 11:36, 8 March 2019 (UTC)
- So we just re-deployed citoid to a new infrastructure and the new stats board is here: https://grafana.wikimedia.org/d/000000011/service-citoid
- The only one with historical data is here: https://grafana.wikimedia.org/d/NJkCVermz/citoid
- We don't unfortunately have the data broken down by wiki. Also those stats aren't very accurate because they include incoming automatic requests. I think I removed it before but it was added back in :). Mvolz (WMF) (talk) 11:50, 8 March 2019 (UTC)
- https://grafana.wikimedia.org/d/000000011/service-citoid?panelId=4&fullscreen&orgId=1&from=now-30d&to=now doesn't seem to have data for anything except URLs, as of 10 days ago. The number of URLs reported at the same time. Do you know what happened? Whatamidoing (WMF) (talk) 19:43, 18 March 2019 (UTC)
- Nope. That seems like a bug.
- I know there have definitely been isbn inputs because I've submitted a few myself! Mvolz (WMF) (talk) 10:22, 21 March 2019 (UTC)
- Actually I think I got them reversed in my post, https://grafana.wikimedia.org/d/NJkCVermz/citoid?refresh=5m&orgId=1 is the current one. Mvolz (WMF) (talk) 11:30, 21 March 2019 (UTC)
- I wish that dashboard did some curve smoothing.
- Are we any closer to being able to say that citoid is used some thousands of times per day? Whatamidoing (WMF) (talk) 17:00, 29 March 2019 (UTC)
How can it work in private wiki in HTTPS
Unfortunately I'm not a professional programmer or developer, so first of all I apologize for my humble explanation of my situation.
I want to establish my own private wiki with an SSL certificate on an AWS EC2 instance.
After setting up the citoid server and the zotero server, both passed their tests.
A PMID search entered directly as a URL in the web browser also returned a normal response:
(MyDomain):1970/api?action=query&format=mediawiki&search=30863548
But in my private wiki behind an HTTPS proxy, citoid is not working in VisualEditor; it shows the message "We couldn't make a citation for you. You can create one manually using the "Manual" tab above.". Chrome debugging reports the error below. (I'm sorry for masking the domain name for privacy. I re-checked the spelling of the domain name in the configuration and there was no spelling error.)
- Mixed Content: The page at (deleted to avoid abusefilter) was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint (deleted to avoid abusefilter). This request has been blocked; the content must be served over HTTPS.
I'm just curious about the possibility of success. Is it a cross-site scripting problem? If so, how can I allow the cross-site requests? (I tried the $wgCrossSiteAJAXdomains configuration, but it did not work.)
Or do I have to customize the citoid server to add SSL? If cross-site scripting can cause a crucial security problem, then I'm willing to customize the citoid server, but I have no expertise in that. Please help me!!! VincentNo15 (talk) 15:28, 14 March 2019 (UTC)
- You need to set the protocol to https in your settings, i.e.
- $wgCitoidServiceUrl = 'https://localhost:1970/api';
- instead of
- $wgCitoidServiceUrl = 'http://localhost:1970/api'; Mvolz (WMF) (talk) 09:43, 15 March 2019 (UTC)
- Sorry for the late response.
- I had already changed that part, but in debugging mode an error message was printed:
- "CORS request did not succeed"
- Even though I'm not a professional developer, I think I may have to change some configuration of the Citoid server to get through the HTTPS proxy.
- Can you help me solve this problem? :( 222.237.67.233 (talk) 16:08, 17 March 2019 (UTC)
- └ I forgot to log in. I'm the same person who asked for help.
- What do you think about using stunnel to get through? VincentNo15 (talk) 16:09, 17 March 2019 (UTC)
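The "Mixed Content" message quoted in this thread comes down to a scheme mismatch: a page loaded over HTTPS asked for a plain-HTTP endpoint. The sketch below shows only that condition. It is a simplification, not how any browser implements the check (browsers make exceptions, for example for some loopback addresses), and the function name and the wiki.example.org URLs are hypothetical.

```python
from urllib.parse import urlparse


def is_mixed_content(page_url, request_url):
    """True when an HTTPS page requests a plain-HTTP endpoint (simplified)."""
    page_scheme = urlparse(page_url).scheme
    request_scheme = urlparse(request_url).scheme
    return page_scheme == "https" and request_scheme == "http"


# Hypothetical URLs standing in for the masked domain in the thread.
wiki_page = "https://wiki.example.org/wiki/Main_Page"
print(is_mixed_content(wiki_page, "http://wiki.example.org:1970/api"))
print(is_mixed_content(wiki_page, "https://wiki.example.org:1970/api"))
```

Changing the configured service URL to https only removes the scheme mismatch; the endpoint behind that URL still has to answer over TLS, which may be why the later "CORS request did not succeed" error appeared.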
Please, return more data: PMC, PMID, ISSN, Publisher
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
As I have asked in https://en.wikipedia.org/wiki/Wikipedia_talk:ProveIt#Automatic_retrieve_other_information:_PMID,_PMC,_ISSN_and_publisher.
If I do a DOI search, it returns the article's title, authors, etc. But it won't return PMID, PMC, ISSN, publisher and other data that I can easily get manually.
It's possible to get PMID and PMC from the DOI using this API: https://www.ncbi.nlm.nih.gov/pmc/tools/id-converter-api/
Google Scholar always returns the ISSN and publisher; they don't have an API, but there are probably others that do. Arthurfragoso (talk) 03:05, 2 July 2019 (UTC)
- Citoid only returns primary data. It doesn't do the next step of asking PubMed "do you recognize this DOI?" ~ AManWithNoPlan (talk) 03:18, 2 July 2019 (UTC)
- OK, now I understand. It is Zotero that retrieves the data. I installed it and did some tests:
- If I do a DOI search, it returns less data: https://pastebin.com/2rY86CVf
- If I do a URL search, it returns DOI, PMID, PMC, etc.: https://pastebin.com/8mLQ696X
- I tried to install citoid, but it failed to build, so I tested the wikipedia server:
- DOI search: https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/10.1053%2Fj.ackd.2013.08.006
- URL search: https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpubmed%2F24206604
- The URL search returns unknown_error; that's probably why I don't use it. :(
- "type":"https://mediawiki.org/wiki/HyperSwitch/errors/unknown_error" Arthurfragoso (talk) 22:48, 2 July 2019 (UTC)
- Apparently it's only with NIH.gov websites, it was already reported here:
- https://en.wikipedia.org/wiki/Wikipedia_talk:RefToolbar#Auto-fill_based_on_PMID_is_down
- and here:
- https://phabricator.wikimedia.org/T226088 Arthurfragoso (talk) 01:38, 3 July 2019 (UTC)
- Thanks for reporting! That nothing is getting through from the PubMed website at all is a separate issue from the one above, and a much more serious problem, unfortunately. Mvolz (WMF) (talk) 11:17, 8 July 2019 (UTC)
- So, we actually added support for this using the pubmed api in 2014 (https://phabricator.wikimedia.org/T1088). Unfortunately, the NIH api has a long, long history of falling over and causing citoid performance issues (https://phabricator.wikimedia.org/T133696). As a result we added a config variable and turned off requesting extra identifiers from their service in production in 2017 (https://phabricator.wikimedia.org/T162886). Since citoid has to be snappy enough to work in real time on user request, I don't see us changing that unless a more reliable / faster service could be found to supply the info.
- My advice would be for a bot to do this work, because that can work in the background and therefore response time isn't as critical. Mvolz (WMF) (talk) 10:47, 3 July 2019 (UTC)
- It's now working! Cheers! Yay! :) Arthurfragoso (talk) 22:06, 10 July 2019 (UTC)
Is there any chance that this will not need nodejs in the future?
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Especially now that parsoid is being ported to use only PHP?
I've had big problems with installing Parsoid and VisualEditor in the past, so I'm eagerly waiting for it to be ported to PHP only. Will Citoid also follow, or is nothing planned yet? MavropaliasG (talk) 03:39, 28 November 2019 (UTC)
- It will not, sorry; but, as an alternative to hosting your own, you could configure your wiki to make requests to https://en.wikipedia.org/api/rest_ using $wgCitoidFullRestbaseURL. Mvolz (WMF) (talk) 11:24, 29 November 2019 (UTC)
- This is interesting. I also thought that Parsoid will be abandoned. Seems to receive continued support for its usage as a shim to restbase or so. Anyways thanks for the info. [[kgh]] (talk) 11:47, 29 November 2019 (UTC)
- Thank you for the reply. MavropaliasG (talk) 02:23, 30 November 2019 (UTC)