
Manual talk:ImportTextFiles.php

From mediawiki.org

Source code for 1.26.3?


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


The infobox includes a link to ImportTextFiles for version 1.26.3, which is a dead link. ImportTextFiles.php did not exist in MediaWiki prior to 1.27. I don't see a way to remove the link from the infobox. Tcrimsonk (talk) 02:20, 6 July 2016 (UTC)

The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

Usage examples don't match


In Usage:

php importTextFiles.php [options...] <file> [<file>...]

the command includes <file> but the Example a few paragraphs down does not.

Meanwhile, the --help text when running the command says to use <files> (plural).

When I try to run with <file> or with <files> the script returns the error "The system cannot find the file specified" whether I use an absolute or relative path to the .txt file I want to import. Tcrimsonk (talk) 18:24, 10 July 2016 (UTC)

I suspect the inconsistency comes from the fact that you can use it with multiple filenames, i.e. <file> [<file>...], or with a wildcard, i.e. <files>, just like any command-line tool that accepts multiple filenames.
Do you get errors if you pass the exact same file specifier to cat (Mac/Linux) or type (Windows)? This, that and the other (talk) 01:13, 11 July 2016 (UTC)
Oh, and the example is passing <file>: it's the meteo-*.txt at the very end. This, that and the other (talk) 01:14, 11 July 2016 (UTC)
I created a simple art1.txt which contains only This is some text for art1 and I've placed copies of it in the folder with php.exe, in the maintenance directory, in the main mediawiki install directory, and on the root of C:\.
When I run the following commands from the maintenance directory, I get the following results:
type art1.txt returns This is some text for art1
importTextFiles.php art1.txt returns Argument <files> required! (followed by the entire help message)
importTextFiles.php <file> art1.txt returns The system cannot find the file specified
importTextFiles.php <files> art1.txt returns The system cannot find the file specified
The second command above is why I think the Example on this manual page is incorrect, because the example lacks a <file> or <files> argument.
I've tried all of the above with art1.txt saved as ANSI and as UTF-8 in Notepad. Tcrimsonk (talk) 01:20, 12 July 2016 (UTC)
I decided to try using edit.php to see if I got the same result. That script requires that the article already exist, so I created the page Art4 in my wiki and Art4.txt in the maintenance folder.
Sure enough: edit.php <title>Art4 Art4.txt returns The system cannot find the file specified.
So it must be something to do with my PHP installation (Running version 5.6.21) or maybe the way I have the path / .php extension association set up.
For what it's worth, I'm able to run other PHP maintenance scripts fine, as long as they don't require filename arguments. Tcrimsonk (talk) 04:52, 12 July 2016 (UTC)
Does it work if you supply a full path as the filename argument, for example, php importTextFiles.php C:\Whatever\Folders\Blah\art1.txt ? This, that and the other (talk) 06:57, 12 July 2016 (UTC)
I FINALLY got something to work:
C:\>C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art1.txt
This command imports art1.txt. However, when I use a wildcard to try to import art1, art2, and art3 as follows:
C:\>C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art*.txt
Then I get a differently-worded error:
Fatal error: The file 'c:\art*.txt' does not exist! Tcrimsonk (talk) 15:29, 12 July 2016 (UTC)
Does it still fail if you include the "php" command at the beginning of the command line, like:
C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art*.txt
If this fails, it is most likely because Windows' command line handling differs from Unix command line handling. This script was written with Unix command line handling in mind. Unix expands wildcards before passing the command line to the program, while Windows does not expand wildcards.
Try editing maintenance/importTextFiles.php. Replace line 66, which begins with $this->error, with the following lines of code:
				$found = false;
				foreach ( glob( $arg ) as $filename ) {
					$found = true;
					$files[$filename] = file_get_contents( $filename );
				}
				if ( !$found ) {
					$this->error( "Fatal error: The file '$arg' does not exist!", 1 );
				}
If that fixes your problem, I'll submit a patch to try to get this fix into MediaWiki 1.28. This, that and the other (talk) 03:12, 13 July 2016 (UTC)
That fixed it! Thank you! Tcrimsonk (talk) 16:41, 15 July 2016 (UTC)
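The wildcard behaviour described above can be illustrated with a small shell sketch. It is only a demonstration under assumptions: `count_args` and `demo_dir` are hypothetical names made up for this example, and quoting the pattern is used here to imitate a command line that passes the wildcard through unexpanded, as Windows does.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Work in a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
touch "$demo_dir/art1.txt" "$demo_dir/art2.txt" "$demo_dir/art3.txt"

# Unquoted: the shell expands the pattern, so the callee sees three paths.
expanded=$(count_args "$demo_dir"/art*.txt)

# Quoted: the pattern is passed through literally as a single argument,
# so the callee has to expand it itself (which the glob() patch does).
literal=$(count_args "$demo_dir/art*.txt")

echo "unquoted pattern: $expanded argument(s)"
echo "quoted pattern:   $literal argument(s)"

rm -rf "$demo_dir"
```

In the quoted case the script receives one argument that still contains the `*`, which matches the "does not exist" error reported before the patch.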
For me
  • using XAMPP on Windows 7 Professional,
  • with the change to importTextFiles.php suggested above
  • and assuming running from c:\xampp
these commands didn't work:
  • php\php htdocs\wiki\maintenance\importTextFiles.php tmp\pages\*.txt
  • php\php htdocs\wiki\maintenance\importTextFiles.php c:\xampp\tmp\pages\*.txt
  • php\php htdocs\wiki\maintenance\importTextFiles.php tmp/pages/*.txt
Only this command worked:
  • php\php htdocs\wiki\maintenance\importTextFiles.php c:/xampp/tmp/pages/*.txt
Something to keep in mind. Nyetman (talk) 14:04, 4 November 2016 (UTC)
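If the file pattern comes from another tool in backslash form, it can be rewritten to the forward-slash form that worked in the report above. This is a minimal sketch, assuming a POSIX-style shell is available (for example one bundled with a Git installation on Windows); `to_forward_slashes` is a hypothetical helper name, not part of MediaWiki.

```shell
#!/bin/sh
# Hypothetical helper: rewrite backslashes as forward slashes in a path or pattern.
to_forward_slashes() {
    # tr receives the two-character escape \\ (one backslash) and replaces it with /
    printf '%s\n' "$1" | tr '\\' '/'
}

# Single quotes keep the backslashes and the * literal.
to_forward_slashes 'c:\xampp\tmp\pages\*.txt'
# prints: c:/xampp/tmp/pages/*.txt
```

The conversion does not make a relative path absolute; in the report above only the absolute forward-slash form worked.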

CRLF being ignored


Hi there.

I managed to set up a documentation wiki for my work team, and I'm writing an automation script to export mainframe procedures as text and import them into the wiki as pages.

The problem here is that I have a text file with line breaks that the ImportTextFiles is not respecting at all.

They import as plain text which breaks the entire proc format.

This:

<code>
//CRMP301D JOB (CRM),'INTERFACE-CRM',MSGLEVEL=(1,1),REGION=6M,
//        MSGCLASS=X,TIME=999,LINES=999,CLASS=P
//*
//* %%SET %%DAT  = %%ODAY.%%OMONTH.%%OYEAR
//* %%SET %%H    = %%TIME
//*
</code>

Becomes this:

<code>
//CRMP301D JOB (CRM),'INTERFACE-CRM',MSGLEVEL=(1,1),REGION=6M, // MSGCLASS=X,TIME=999,LINES=999,CLASS=P //* //* %%SET %%DAT = %%ODAY.%%OMONTH.%%OYEAR //* %%SET %%H = %%TIME //*
</code> 200.159.15.248 (talk) 18:19, 26 August 2016 (UTC)

I made a workaround in shell:
#!/bin/bash
# Directory holding the exported text files
DIR=/home/ubuntu/txt_import
export LANGUAGE=ptb
export NLS_LANG="BRAZILIAN PORTUGUESE_BRAZIL.WE8MSWIN1252"
export LANG=pt_BR.UTF-8
export MWOS=linux

for FILE in "$DIR"/*
do
    # F is not used in the part of the script that was posted
    F=$(basename "$FILE")
    # Replace each carriage return with <br> so the wiki keeps the line breaks
    sed -i 's/\r/<br>/g' "$FILE"
done
This does what I want. 200.159.15.249 (talk) 19:12, 26 August 2016 (UTC)
The importTextFiles.php script treats the content of the text files as wikitext. If you want them interpreted as plain text, simply prefix each file with <pre> on its own line (no need for a closing tag). That way, line breaks will be preserved. This, that and the other (talk) 09:40, 27 August 2016 (UTC)
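The `<pre>` suggestion above could be applied to a batch of files before import with something like the following. It is a sketch only: `prepend_pre` is a hypothetical helper name, and the directory in the commented loop is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: put "<pre>" on its own first line of a text file.
prepend_pre() {
    file=$1
    tmp=$(mktemp) || return 1
    # Build the new content in a temp file, then copy it back over the
    # original so the original file's permissions are kept.
    { printf '<pre>\n'; cat "$file"; } > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example use on a placeholder directory of exported files:
# for f in /path/to/txt_import/*; do prepend_pre "$f"; done
```

Running the helper twice on the same file adds the tag twice, so it is meant to be run once per export.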

broken UTF-8


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


After I create a text file that also contains some UTF-8 characters and save it with UTF-8 encoding, I import the file with ImportTextFiles.php.

The file imports with no errors and the new page is created, but when I edit the page, I get this character soup of question marks.

�d�e�v�a� �=� �8 > ' ( > �* % �|�

�t�y�p�e� �=� �d�i�s�c�o�u�r�s�e� �|�

�d�a�t�e� �=� �3� �J�u�l� �1�9�6�4� �p�m� �|�

....etc....

does the php routine convert everything to plain ASCII ?

and can that be avoided?

thanks -rudy

MediaWiki 1.31.1
PHP 7.0.32 (cgi-fcgi)
MySQL 5.6.34-log
106.68.121.210 (talk) 08:04, 7 February 2019 (UTC)
I found the problem.
My input files were not properly UTF-8 encoded.

106.68.121.210 (talk) 10:36, 7 February 2019 (UTC)
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.
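For anyone hitting the same symptom, the input files can be checked before import. The sketch below assumes the `iconv` command-line tool is installed; `is_valid_utf8` and `check_files` are hypothetical helper names. It only detects byte sequences that are invalid in UTF-8, so a file in another encoding that happens to contain only valid sequences would still pass.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: report the status of each file given as an argument.
check_files() {
    for f in "$@"; do
        if is_valid_utf8 "$f"; then
            echo "ok: $f"
        else
            echo "NOT valid UTF-8: $f"
        fi
    done
}

# Example use on a placeholder directory:
# check_files /path/to/pages/*.txt
```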

MW 1.35.4, PHP warning when script runs with PHP 7.4


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


When used with PHP 7.4.24, the script generates a PHP warning:


PHP Warning: "continue" targeting switch is equivalent to "break". Did you mean to use "continue 2"? in /var/www/prod/mediawiki/extensions/ParserFunctions/includes/ExprParser.php on line 242


No warning in MW 1.31.15 with PHP 7.2.24.

Minor problem, I just wanted to mention it. 2003:C2:3F22:8200:ECC6:76B0:8D2F:F64E (talk) 07:17, 22 October 2021 (UTC)

I've reported this at phab:T294170. This, that and the other (talk) 01:42, 23 October 2021 (UTC)
I'm not sure why MW 1.35.4 in the title...
But 1.31 is EOL, and this issue was fixed in the 1.32 cycle, but never backported. Reedy (talk) 13:38, 23 October 2021 (UTC)
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.

MW 1.35.4, PHP warning when script runs with PHP 7.4, second try


The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Yes. 1.31 is EOL, this is why we upgraded to 1.35.4. Precondition for MW 1.35 is PHP 7.3+, so additionally we upgraded to PHP 7.4.24. Whilst we *never* had a similar warning with 1.31.15 and PHP 7.2.24, *now* we have a PHP warning using the script with MW 1.35.4 and PHP 7.4.24.

So the issue may have been fixed in 1.32, but it is surely *not* fixed in 1.35.4. I hope the description is clear now.

Closing the request without having understood it is, let us say, a little bit impolite. 2003:C2:3F22:8200:A9D7:4E38:B57:CEE9 (talk) 18:17, 23 October 2021 (UTC)

Try reading the task that was created? phab:T294170
Line 242 of ExprParser.php in 1.35 https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/blob/REL1_35/includes/ExprParser.php#L242 is
if ( !isset( $this->words[$word] ) ) {
The error message you're quoting seems to be still as it would be on the 1.31 branch - https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/blob/REL1_31/includes/ExprParser.php#L242
break;
It was fixed in https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/commit/d258457e018b which landed in 1.32, and is definitely part of 1.35.
It looks like you've not updated the ParserFunctions extension when you've updated MediaWiki core et al, if line 242 is still giving you an error about a break statement. Reedy (talk) 14:07, 26 October 2021 (UTC)
You are perfectly right. I beg your pardon! The test and live installations are correct but I mixed up the different wiki paths, and so I "managed" to call the old 1.31 script out of the MW 1.35/PHP7.4 environment. I am to blame, no one else. My request was total bs. 2003:C2:3F22:8200:DDFC:20B4:8B4C:47FD (talk) 14:58, 26 October 2021 (UTC)
The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.
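A quick way to see which copy of the extension a given path actually contains is to print the line named in the warning and compare it with the branch excerpts quoted above. This is a small sketch; `print_line` is a hypothetical helper name, and the path in the commented example is the one from the warning message.

```shell
#!/bin/sh
# Hypothetical helper: print a single numbered line of a file.
# Usage: print_line FILE LINE_NUMBER
print_line() {
    sed -n "${2}p" "$1"
}

# Example, using the path from the warning quoted in the previous thread:
# print_line /var/www/prod/mediawiki/extensions/ParserFunctions/includes/ExprParser.php 242
```

If the output is the `isset( $this->words[$word] )` check, the file matches the REL1_35 excerpt quoted above; if it is a bare `break;`, the file matches the REL1_31 excerpt instead.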