Manual talk:ImportTextFiles.php

MW 1.35.4, PHP warning when script runs with PHP 7.4, second try

2003:C2:3F22:8200:A9D7:4E38:B57:CEE9 (talkcontribs)

Yes. 1.31 is EOL, this is why we upgraded to 1.35.4. Precondition for MW 1.35 is PHP 7.3+, so additionally we upgraded to PHP 7.4.24. Whilst we *never* had a similar warning with 1.31.15 and PHP 7.2.24, *now* we have a PHP warning using the script with MW 1.35.4 and PHP 7.4.24.

So the issue may have been fixed in 1.32, but it is certainly *not* fixed in 1.35.4. I hope the description is clear now.

Closing the request without having understood it is, let us say, a little bit impolite.

Reedy (talkcontribs)

Try reading the task that was created? phab:T294170

Line 242 of ExprParser.php in 1.35 https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/blob/REL1_35/includes/ExprParser.php#L242 is

if ( !isset( $this->words[$word] ) ) {


The error message you're quoting seems to be still as it would be on the 1.31 branch - https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/blob/REL1_31/includes/ExprParser.php#L242

break;


It was fixed in https://github.com/wikimedia/mediawiki-extensions-ParserFunctions/commit/d258457e018b which landed in 1.32, and is definitely part of 1.35.


It looks like you've not updated the ParserFunctions extension when you've updated MediaWiki core et al, if line 242 is still giving you an error about a break statement.

2003:C2:3F22:8200:DDFC:20B4:8B4C:47FD (talkcontribs)

You are perfectly right. I beg your pardon! The test and live installations are correct but I mixed up the different wiki paths, and so I "managed" to call the old 1.31 script out of the MW 1.35/PHP7.4 environment. I am to blame, no one else. My request was total bs.

Tcrimsonk (talkcontribs)

The infobox includes a link to ImportTextFiles for version 1.26.3, which is a dead link. ImportTextFiles.php did not exist in MediaWiki prior to 1.27. I don't see a way to remove the link from the infobox. Tcrimsonk (talk) 02:20, 6 July 2016 (UTC)

106.68.121.210 (talkcontribs)

After I create a text file that also contains some UTF-8 characters and save it with UTF-8 encoding, I import the file with ImportTextFiles.php.

The file imports with no errors and the new page is created, but when I edit the page, I get this character soup of question marks:

�d�e�v�a� �=� �8 > ' ( > �* % �|�

�t�y�p�e� �=� �d�i�s�c�o�u�r�s�e� �|�

�d�a�t�e� �=� �3� �J�u�l� �1�9�6�4� �p�m� �|�

....etc....

Does the PHP routine convert everything to plain ASCII?

And can that be avoided?

thanks -rudy

MediaWiki 1.31.1
PHP 7.0.32 (cgi-fcgi)
MySQL 5.6.34-log

106.68.121.210 (talkcontribs)

I found the problem.

My input files were not properly UTF-8 encoded.
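
For anyone hitting the same symptom, here is a minimal sketch of how the input files could be checked and re-encoded before import. The helper names `is_clean_utf8` and `to_utf8` are hypothetical, and it is only an assumption that the alternating pattern above came from UTF-16 input (for example, Notepad's "Unicode" save option); the source encoding has to be supplied by whoever knows how the files were saved.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file decodes as UTF-8
# and contains no NUL bytes (NULs are a common sign of UTF-16 text).
is_clean_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1 || return 1
    tr -d '\000' < "$1" | cmp -s - "$1"
}

# Hypothetical helper: re-encode a file to UTF-8.
# $1 = source encoding (an assumption you must supply, e.g. UTF-16LE)
# $2 = input file, $3 = output file
to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}
```

A typical use would be to run `is_clean_utf8 page.txt` on each file and, where it fails, call `to_utf8 UTF-16LE page.txt page.utf8.txt` (with whatever encoding actually applies) before passing the result to importTextFiles.php.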


MW 1.35.4, PHP warning when script runs with PHP 7.4

2003:C2:3F22:8200:ECC6:76B0:8D2F:F64E (talkcontribs)

When used with PHP 7.4.24, the script generates a PHP warning:


PHP Warning: "continue" targeting switch is equivalent to "break". Did you mean to use "continue 2"? in /var/www/prod/mediawiki/extensions/ParserFunctions/includes/ExprParser.php on line 242


No warning in MW 1.31.15 with PHP 7.2.24.

Minor problem, I just wanted to mention it.

This, that and the other (talkcontribs)
Reedy (talkcontribs)

I'm not sure why MW 1.35.4 is in the title...

But 1.31 is EOL, and this issue was fixed in the 1.32 cycle, but never backported.

Usage examples don't match

Tcrimsonk (talkcontribs)

In Usage:

php importTextFiles.php [options...] <file> [<file>...]

the command includes <file> but the Example a few paragraphs down does not.

Meanwhile, the --help text when running the command says to use <files> (plural).

When I try to run with <file> or with <files> the script returns the error "The system cannot find the file specified" whether I use an absolute or relative path to the .txt file I want to import.

This, that and the other (talkcontribs)

I suspect the inconsistency comes from the fact that you can use it with multiple filenames, i.e. <file> [<file>...], or with a wildcard, i.e. <files>, just like any command-line tool that accepts multiple filenames.

Do you get errors if you pass the exact same file specifier to cat (Mac/Linux) or type (Windows)?

This, that and the other (talkcontribs)

Oh, and the example is passing <file>: it's the meteo-*.txt at the very end.

Tcrimsonk (talkcontribs)

I created a simple art1.txt which contains only This is some text for art1 and I've placed copies of it in the folder with php.exe, in the maintenance directory, in the main mediawiki install directory, and on the root of C:\.

When I run the following commands from the maintenance directory, I get the following results:

type art1.txt returns This is some text for art1

importTextFiles.php art1.txt returns Argument <files> required! (followed by the entire help message)

importTextFiles.php <file> art1.txt returns The system cannot find the file specified

importTextFiles.php <files> art1.txt returns The system cannot find the file specified

The second command above is why I think the Example on this manual page is incorrect, because the example lacks a <file> or <files> argument.

I've tried all of the above with art1.txt saved as ANSI and as UTF-8 in Notepad.

Tcrimsonk (talkcontribs)

I decided to try using edit.php to see if I got the same result. That script requires that the article already exist, so I created the page Art4 in my wiki and Art4.txt in the maintenance folder.

Sure enough: edit.php <title>Art4 Art4.txt returns The system cannot find the file specified.

So it must be something to do with my PHP installation (Running version 5.6.21) or maybe the way I have the path / .php extension association set up.

For what it's worth, I'm able to run other PHP maintenance scripts fine, as long as they don't require filename arguments.

This, that and the other (talkcontribs)

Does it work if you supply a full path as the filename argument, for example, php importTextFiles.php C:\Whatever\Folders\Blah\art1.txt ?

Tcrimsonk (talkcontribs)

I FINALLY got something to work:

C:\>C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art1.txt

This command imports art1.txt. However, when I use a wildcard to try to import art1, art2, and art3 as follows:

C:\>C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art*.txt

I get a differently worded error:

Fatal error: The file 'c:\art*.txt' does not exist!

This, that and the other (talkcontribs)

Does it still fail if you include the "php" command at the beginning of the command line, like:

C:\Bitnami\mediawiki-1.26.3-0\php\php.exe C:\Bitnami\mediawiki-1.26.3-0\apps\mediawiki\htdocs\maintenance\importTextFiles.php c:\art*.txt

If this fails, it is most likely because Windows' command line handling differs from Unix command line handling. This script was written with Unix command line handling in mind. Unix expands wildcards before passing the command line to the program, while Windows does not expand wildcards.
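
The difference can be seen with a small sketch on a Unix shell. The function `count_args` is a hypothetical stand-in for any program that receives filename arguments; quoting the pattern suppresses expansion and roughly mimics what a program sees when started from the Windows command prompt.

```shell
#!/bin/sh
# Hypothetical stand-in for a program: report how many arguments arrived.
count_args() {
    echo "$#"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/art1.txt" "$demo_dir/art2.txt" "$demo_dir/art3.txt"

# Unquoted: the shell expands the pattern first, so three arguments arrive.
count_args "$demo_dir"/art*.txt

# Quoted: no expansion happens, so a single literal 'art*.txt' argument
# arrives and the program would have to expand it itself.
count_args "$demo_dir/art*.txt"

rm -rf "$demo_dir"
```

In the second case the program has to do the expansion on its own, which is what the glob() change below is meant to provide.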

Try editing maintenance/importTextFiles.php. Replace line 66, which begins with $this->error, with the following lines of code:

				$found = false;
				foreach ( glob( $arg ) as $filename ) {
					$found = true;
					$files[$filename] = file_get_contents( $filename );
				}
				if ( !$found ) {
					$this->error( "Fatal error: The file '$arg' does not exist!", 1 );
				}

If that fixes your problem, I'll submit a patch to try to get this fix into MediaWiki 1.28.

Tcrimsonk (talkcontribs)

That fixed it! Thank you!

Nyetman (talkcontribs)

For me

  • using XAMPP on Windows 7 Professional,
  • with the change to importTextFiles.php suggested above
  • and assuming running from c:\xampp

these commands didn't work:

  • php\php htdocs\wiki\maintenance\importTextFiles.php tmp\pages\*.txt
  • php\php htdocs\wiki\maintenance\importTextFiles.php c:\xampp\tmp\pages\*.txt
  • php\php htdocs\wiki\maintenance\importTextFiles.php tmp/pages/*.txt

Only this command worked:

  • php\php htdocs\wiki\maintenance\importTextFiles.php c:/xampp/tmp/pages/*.txt

Something to keep in mind.

CRLF being ignored

200.159.15.248 (talkcontribs)

Hi there.

I managed to set up a documentation wiki for my work team, and I'm writing an automation script to export mainframe procedures as text and import them into the wiki as pages.

The problem is that I have a text file with line breaks that ImportTextFiles does not respect at all.

The lines are imported as running text, which breaks the entire proc format.

This:

<code>

//CRMP301D JOB (CRM),'INTERFACE-CRM',MSGLEVEL=(1,1),REGION=6M,

//        MSGCLASS=X,TIME=999,LINES=999,CLASS=P

//*

//* %%SET %%DAT  = %%ODAY.%%OMONTH.%%OYEAR

//* %%SET %%H    = %%TIME

//*

</code>

Becomes this:

<code>

//CRMP301D JOB (CRM),'INTERFACE-CRM',MSGLEVEL=(1,1),REGION=6M, // MSGCLASS=X,TIME=999,LINES=999,CLASS=P //* //* %%SET %%DAT = %%ODAY.%%OMONTH.%%OYEAR //* %%SET %%H = %%TIME //*

</code>

200.159.15.249 (talkcontribs)

I made a workaround in shell:

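#!/bin/bash
# Directory holding the exported text files
DIR=/home/ubuntu/txt_import
# Locale settings from the original environment
LANGUAGE=ptb; export LANGUAGE
NLS_LANG="BRAZILIAN PORTUGUESE_BRAZIL.WE8MSWIN1252"; export NLS_LANG
LANG=pt_BR.UTF-8; export LANG
MWOS=linux; export MWOS

for FILE in "$DIR"/*
do
    F=$(basename "$FILE")
    # Replace each carriage return with <br> so the wiki keeps the line break
    sed -i 's/\r/<br>/g' "$FILE"
done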

This does what I want.

This, that and the other (talkcontribs)

The importTextFiles.php script treats the content of the text files as wikitext. If you want them interpreted as plain text, simply prefix each file with <pre> on its own line (no need for a closing tag). That way, line breaks will be preserved.
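
As a rough sketch of that suggestion, the prefix could be added in a preprocessing step before the import. The helper name `add_pre_prefix` is hypothetical; it writes a copy rather than editing the exported file in place.

```shell
#!/bin/sh
# Hypothetical helper: copy $1 to $2 with <pre> on its own first line,
# so the wiki renders the rest of the file as preformatted text.
add_pre_prefix() {
    { printf '<pre>\n'; cat "$1"; } > "$2"
}
```

It could be called in a loop over the export directory, for example `add_pre_prefix "$FILE" "$OUT_DIR/$(basename "$FILE")"`, where `OUT_DIR` is an assumed output directory that is then passed to importTextFiles.php.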

Reply to "CRLF being ignored"