Extension talk:Bullet Feed

From mediawiki.org

This is my first public extension, and I have yet to really dive into the code and make something really useful. I know that other great PHP programmers on this site have probably made better-structured scripts for the same thing, but I couldn't find one that I could readily put in a minimalistic format such as the one found on http://hdwiki.nmu.edu. If you have any questions, comments, etc., feel free to email me at cperry@nmu.edu -03:42, 6 February 2008 (UTC)

  • Are you aware that Google has flagged your link?

Updates

  • Updated the links algorithm again to take into account $wgServer instead of computing my own.
-muraj 07:17, 3 April 2008 (UTC)
  • Fixed descriptions not parsing html into html code.
  • Implemented a more portable algorithm for parsing default links
  • Default site address is now http://domain_name/$wgScriptPath
-muraj 15:43, 28 February 2008 (UTC)

Couple of changes

I made a couple of minor changes to the code: adding $wgExtensionCredits so it appears in Special:Version, and changing the alt URL generator slightly ($header). I dug into $wgTitle and found that it has quite a few methods for URL generation, which helped with that task. $site and $link in feedAction can probably be optimized quite a bit (I didn't spend too long trying to decode what was going on in $link). They both seem to work OK for me, though, so I left them alone. Thanks for the extension. 67.97.209.36 22:41, 29 July 2008 (UTC)

Possible to use RSS page as normal viewable page too?

Is it possible to use the bullet-list page as a normal viewable page too? i.e. so it doesn't display the <bullet_feed> and <rss-desc> tags on the page? I already have a bullet list "news" page so this extension could be ideal... Thanks Jonathan3 07:08, 24 September 2008 (UTC)

To answer my own question... the bullet points between the <bullet_feed> tags do get displayed as normal; however, the description between the <rss-desc> tags doesn't get displayed on the wiki page. The extension seems to work very well. Jonathan3 18:58, 2 October 2008 (UTC)
There is a reason for this. I was thinking of making it optional in a future edition, but I really wanted to keep the layout of a bulleted list as a minimalist set of headlines for a front page, while readers could still access the same content in more detail via the RSS action page. -muraj 01:20, 4 October 2008 (UTC)

Minor amendment - change link on item's title

Some of my bullet points had links to several wiki pages (in the RSS title as opposed to the descriptions). The code picks the first one and makes the RSS item link to that. To work round this I changed it so that the link is always the "Recent Changes" page of the wiki. I know this isn't ideal, but it saved me some trouble rejigging my bullet points. Change the relevant line to:

$i=new FeedItem(strip_tags($title),$desc[1],'http://www.mysite.co.uk/Special:Recentchanges',$Date=wfTimestampNow()-$titlekey);

I've not made any changes to the code on the extension page. Jonathan3 18:58, 2 October 2008 (UTC)

This was intentional too, as the need at the time was one link per title, which really is all you need if you write your headlines well. Although this assumption holds in most cases, I would like to make it more robust: optionally select a link target, optionally set it dynamically (the first link in the title), or have a tag to set a static one. --muraj 01:28, 4 October 2008 (UTC)
There is also a reason for the $link.'#'.strip_tags($title). Some readers decide whether an item is new based on whether its link has been read, not on the publication date. The anchor I append makes each item's link distinct, so the reader will still show new items in that case. --muraj 15:53, 6 October 2008 (UTC)
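
The per-item anchor described above can be sketched in isolation. This is only an illustration, not the extension's code: the helper name bulletFeedItemLink is hypothetical, and the rawurlencode() call is an added assumption (the line quoted above appends the stripped title as-is).

```php
<?php
// Hypothetical helper (not part of the extension): build a per-item link by
// appending the headline as a URL fragment, so that two items sharing the
// same base link still look distinct to a feed reader.
function bulletFeedItemLink(string $link, string $titleHtml): string {
    // Strip markup, trim whitespace, and percent-encode the headline.
    // The encoding step is an assumption added for this sketch.
    $anchor = rawurlencode(trim(strip_tags($titleHtml)));
    return $link . '#' . $anchor;
}

echo bulletFeedItemLink('http://www.mysite.co.uk/News', '<b>Server upgrade</b> tonight'), "\n";
```

Two headlines that point at the same page then get different links, which is the property the anchor is meant to provide.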

Minor amendment - display text inside rss-desc tags on wiki page

All links in the description work on the RSS feed whereas in the title only one can be used (the first one is selected, unless you make the amendment above). So I decided to use the rss-desc tags after all - but the text between the rss-desc tags doesn't get displayed on the wiki page. I wanted to use the rss-desc tags (so that multiple links in the news item could be shown) and wanted the content to be displayed on the wiki page as well as on the RSS. I changed the relevant line to:

    return $parser->recursiveTagParse($input).'<!-- rss-desc text="'.$parser->recursiveTagParse($input).'" -->';

I have NOT made any changes to the original text on the Extension page :-) Jonathan3 19:24, 2 October 2008 (UTC)

This "hidden" text is a feature, but you're right, I should make this optional to the feed. --muraj 01:33, 4 October 2008 (UTC)

RSS autodiscovery doesn't seem to work on IE7

...although it works fine in Firefox. Jonathan3 20:01, 2 October 2008 (UTC)

Not sure why this is. Make sure you clear the cache for both the wiki and your browser. I use <link rel="alternate" /> tags in the feedParse function to make the RSS icon show in the browser. I'm not sure what IE7 looks for, but it worked in both at the time of submission. --muraj 01:39, 4 October 2008 (UTC)

Links within description text work with Google Reader but not Google home page

On Google Reader the links work fine, e.g. http://www.mysite.co.uk/Page name. On the iGoogle home page the links render as, e.g., http://www.google.co.uk/Page name. Does anyone know why this is? Jonathan3 22:07, 2 October 2008 (UTC)

I've heard of this issue. I'm not sure of the cause, but I will look into it. --muraj 01:40, 4 October 2008 (UTC)

More serious amendment - constant pubdate for each item required

The code as it stands on the extension page doesn't seem properly to assign a pubdate to each item. It basically uses the date at the time of the most recent refresh. The effect is that if you have 10 items, the RSS reader adds those 10 items on every refresh, so that in no time the reader will have 100 or 200 items for your feed instead of 10. To make the pubdate fixed for each item, I used the following work-around.

The bullet points are in the format:

* 01/01/08: Something happened today <rss-desc>Detail</rss-desc>
* 01/05/08: Something happened on 01/03/08 as well <rss-desc>Detail</rss-desc>

I wanted to make the first date on each bullet point become the pubdate, so made the following change. Remove the following line (which you might have already amended: see above).

$i=new FeedItem(strip_tags($title),$desc[1],$link.'#'.strip_tags($title),$Date=wfTimestampNow()-$titlekey);

Replace with:

$pat = "([0-9]{1,2})/([0-9]{1,2})/([0-9]{2})";
$regs = array();
if(ereg($pat,$title,$regs)) {
	$datetimestamp=mktime(0,0,0,$regs[2],$regs[1],$regs[3]);
} 
$i=new FeedItem(strip_tags($title),$desc[1],'http://www.mysite.co.uk/Special:Recentchanges',$Date=$datetimestamp);

[NB The reference to 'http://www.mysite.co.uk/Special:Recentchanges' can be left as before, i.e. $link.'#'.strip_tags($title) -- see above for the reason I made this change]

This seems to make sure that each feed item has a constant date, rather than one which changes on each refresh. Note that the date has to be in the format DD/MM/YY with leading zeroes where necessary. US users might want to change this to MM/DD/YY by changing the relevant part to $regs[1],$regs[2],$regs[3]

I hope this makes sense. If someone can verify that the php code works then perhaps this final change could be made to the code on the Extension page. I haven't made any changes to that page.

Best wishes Jonathan3 21:55, 3 October 2008 (UTC)

Yes, this was sort of a hack, as I was looking for a dynamic method of adding the dates and removing them when the user deletes the elements. I'm still debating several options for this, none of which seem very clean. I do not want the user to have to input a date; this should be automatic, and the format should be easily modifiable to whatever the user wishes to show on the page. This was going to be a future feature, but as I'm lazy, I haven't gotten around to it :-). I implemented the above to make it work with particular RSS readers (the Firefox add-on mentioned at the bottom); it should be more robust. --muraj 03:17, 4 October 2008 (UTC)
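
For anyone trying the date-prefix workaround above on a current PHP: ereg() was deprecated in PHP 5.3 and removed in PHP 7.0, so the same idea needs preg_match(). The following is a minimal sketch only, assuming PHP 7.1 or later. The function name bulletFeedItemDate, the $monthFirst switch, and the reading of two-digit years as 20YY are assumptions made for illustration and are not part of the extension.

```php
<?php
// Hypothetical helper: read the first DD/MM/YY (or MM/DD/YY) date in a
// headline and return a Unix timestamp, or null when no valid date is found.
function bulletFeedItemDate(string $title, bool $monthFirst = false): ?int {
    if (!preg_match('#([0-9]{1,2})/([0-9]{1,2})/([0-9]{2})#', $title, $m)) {
        return null; // the caller decides on a fallback, e.g. time()
    }
    $day   = (int) ($monthFirst ? $m[2] : $m[1]);
    $month = (int) ($monthFirst ? $m[1] : $m[2]);
    $year  = 2000 + (int) $m[3]; // assumption: two-digit years mean 20YY
    if (!checkdate($month, $day, $year)) {
        return null; // e.g. 31/02/08
    }
    // mktime() takes hour, minute, second, month, day, year.
    return mktime(0, 0, 0, $month, $day, $year);
}

$ts = bulletFeedItemDate('01/05/08: Something happened on 01/03/08 as well');
echo date('Y-m-d', $ts), "\n";
```

Returning null instead of leaving the variable unset keeps the "no date in this bullet" case explicit, so the feed code can choose its own fallback.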

How do I change the "Trigger"-Symbol?

Hi! This extension only works with lists, using the "*" symbol. I have a table with several rows and columns that are all part of a single RSS news entry. How can I use that instead?

Not a bad idea, but as of now the code specifically targets bulleted lists only (i.e. those rendered with the final html tags <ul> and <ol>). As soon as I get back to editing this code I might add that feature, but it won't be for a while. --muraj 20:17, 12 February 2009 (UTC)
I found one possible solution for myself. Look for the line:
$titles=preg_split('/<li>(.*?)/s',$feed);
and change the <li> to whatever you want, for example to <tr> if you want to use a table as an input. Have fun! --17:45, 28 February 2009 (UTC)
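
The delimiter swap suggested above can be tried outside the wiki. This is a rough sketch under stated assumptions: the function name bulletFeedSplitItems and its $tag parameter are hypothetical, and the pattern also accepts attributes on the opening tag, which the original one-line pattern does not.

```php
<?php
// Hypothetical helper: split rendered HTML into items on a configurable
// opening tag ('li' for bulleted lists, 'tr' for table rows, and so on).
function bulletFeedSplitItems(string $html, string $tag = 'li'): array {
    // Match the opening tag with or without attributes, e.g. <tr class="x">.
    $pattern = '/<' . preg_quote($tag, '/') . '\b[^>]*>/i';
    $parts = preg_split($pattern, $html);
    $items = [];
    // Skip the chunk before the first item, and any chunk with no visible text.
    foreach (array_slice($parts, 1) as $part) {
        if (trim(strip_tags($part)) !== '') {
            $items[] = $part;
        }
    }
    return $items;
}

// Each returned chunk still contains markup; strip it to see the text.
print_r(array_map('strip_tags', bulletFeedSplitItems('<ul><li>First</li><li>Second</li></ul>')));
print_r(array_map('strip_tags', bulletFeedSplitItems(
    '<table><tr><td>A</td><td>B</td></tr><tr><td>C</td></tr></table>', 'tr')));
```

With 'tr' as the tag, every table row becomes one item, so all cells of a row end up in the same news entry.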

My revision

I'm posting the following code from my revision of this extension, with some fixes to make it validate against the RSS XML schema.

Could someone (whoever posted this, or someone who's had time to look through it) possibly describe the changes? Thanks in advance. Jonathan3 (talk) 14:19, 12 April 2015 (UTC)
<?php

/*
*  Bullet_feed, by Cory Perry (cperry@nmu.edu).  Made to keep the formatting 
*  on the main page of the hdwiki while still parsing the data
*  into an xml file built on the fly and accessed in real-time
*
*  Usage:
*  <bullet_feed title="My Title" desc="My Feed's Description">
*  *Title of item #1
*  <rss-desc>Description of news item</rss-desc>
*  *Title of item #2
*  *Title of [[item #3]]
*  <rss-desc>This item has an automatic link</rss-desc>
*  </bullet_feed>
*/
require_once("$IP/includes/Feed.php");
require_once("$IP/includes/Sanitizer.php");
$wgExtensionCredits['parserhook'][] = array(
    'name' => 'Bullet Feed',
    'version' => '1.2b',
    'author' => 'Cory Perry',
    'url' => 'http://www.mediawiki.org/wiki/Extension:Bullet_Feed',
    'description' => 'Allows a minimalistic bullet-style news page thrown into a simple RSS feed',
    'descriptionmsg' => 'bulletfeed-desc',
);
$wgHooks['UnknownAction'][]='feedAction';
$wgExtensionFunctions[] = 'feedSetup';
/*feedSetup:
* Setup the tag parsing hook functions.
*/
function feedSetup() {
    global $wgParser;
    $wgParser->setHook( 'bullet_feed', 'feedParse' );
    $wgParser->setHook('rss-url','urlParse');
    $wgParser->setHook('rss-desc','descParse');
}
/*descParse:
*Parse the <rss-desc> tags and wrap the parsed html in a
*<span class="rss-desc"> so it can be found later in the action field.
*Used to grab the description of the news item.
*Params:
*$input: text enclosed by the wiki tags
*$args: any arguments passed to the tags, ignored here.
*$parser: pointer to the parser class that holds all 
*the tag/extension modifiers (usually $wgOut)
*/
function descParse( $input, $args, $parser ) {
    return '<span class="rss-desc">' . $parser->recursiveTagParse($input) . '</span>';
}
/*urlParse *UNIMPLEMENTED*:
*Parse the <rss-url> tags and output 
*the html code in comments for parsing later in the action field
*Used to grab a direct URL from the wiki tags. 
*Params:
*$input: text enclosed by the wiki tags
*$args: any arguments passed to the tags, ignored here.
*$parser: pointer to the parser class that holds all 
*the tag/extension modifiers (usually $wgOut)
*/
function urlParse( $input, $args, $parser ) {
    return '<!-- rss-url text="'.$input.'" -->';
}
$glob_match=array();
/*feedParse :
*Surrounds the RSS feed in enclosing comment fields for easy parsing.
*Also adds a bit of html code for supported browsers to redirect to 
*the RSS feed from the main page.
*Params:
*$input: text enclosed by the wiki tags
*$args: any arguments passed to the tags, like title="", desc="" for 
*the site.  All others are added but ignored in the RSS feed action.
*$parser: pointer to the parser class that holds all 
*the tag/extension modifiers (usually $wgOut)
*/
function feedParse( $input, $args, $parser ) {
    global $wgServer, $wgScript, $wgTitle;
    $str=$parser->recursiveTagParse($input); //Get html code of page
    //vv-- Make-up a header that includes the redirect feature.
    $header='<!-- FEED_START -->'.'<!-- ';
    foreach($args as $name=>$value){	//Add the arguments as comments to the code.
    	$header.=$name.'="'.$value.'" ';
    }
    $header.='-->';
    return $header.$str.'<!-- FEED_END -->';	//Return the fully parsed html code where the tags were before.
}

/*cleanLinkAttribute:
*Clean HTML link tag attribute.
*Called by |cleanLink|.
*Params:
*$matches : Link attribute
*/
function cleanLinkAttribute($matches) {
	$attrName = strtolower(trim($matches[1]));

	// Remove unsupported attributes
	if ($attrName == "onclick") {
		return ""; 
	}

	// ---

	$value = $matches[2];

	if ($attrName == "href") {
		global $wgScriptPath;
		$len = strlen($wgScriptPath);
		if (substr($value, 0, $len) == $wgScriptPath) {
			// expand site url as absolute one
			global $bulletFeedSite;
			$value = $bulletFeedSite . substr($value, $len);
		}
	}

	return $matches[1] . "=\"" . $value . "\"";
}

/*cleanLink:
*Clean link HTML tag from content used as RSS item description.
*Called by |feedAction|.
*Params:
*$matches: Link attributes.
*/
function cleanLink($matches) {
	return "<a" . preg_replace_callback('/(.*?)="(.*?)"/', "cleanLinkAttribute", $matches[1]) . ">";
}

/*feedAction:
*Core code for outputting the actual RSS xml file on the fly.
*Called on by http://mysite.ext/index.php?title=my_page&action=bullet_feed
*Params:
*$action: action and arguments sent via address. (arguments ignored).
*$article: Entire article object to be parsed.
*/
function feedAction($action,$article){
	global $wgOut,$wgScriptPath,$wgServer;

	if(!($action=='bullet_feed')){ return true;}
	//Parse the article into html and strip out the table of contents and edit sections
	$content = $wgOut->parse($article->getContent()."\n__NOEDITSECTION__ __NOTOC__");
	//Get the page address for use as a default link.

	preg_match('/^http:\/\/(.*?)\//s',$article->getTitle()->getFullUrl(),$site);

	$site='http://'.$site[1].$wgScriptPath;
	global $bulletFeedSite;
	$bulletFeedSite  = $site;

	//Cut out all the extra stuff and grab the RSS arguments within the html code.
	preg_match_all('/<!--\\s*FEED_START\\s*-->(.*?)<!--\\s*FEED_END\\s*-->/s',$content,$matches);
	preg_match('/<!--(?:.*?)title=\"(.*?)\"/s',$matches[1][0],$Feedtitle);
	preg_match('/<!--(?:.*?)desc=\"(.*?)\"/s',$matches[1][0],$Feeddesc);

	global $wgHtmlEntities;
	$entpa = array();
	$entre = array();

	foreach ($wgHtmlEntities as $name => $val) {
		$entpa[] = "/&" . $name . ";/";
		$entpa[] = "/&#" . $val . ";/";

		$entre[] = " ";
		$entre[] = " ";
	}

	foreach ($matches[1] as $feedKey=>$feed){
		//Make a new standard RSSFeed object to use to output the xml
		$feedStream = new RSSFeed($Feedtitle[1],$Feeddesc[1], $site);

		ob_start();
		$feedStream->outHeader();
		$rssHeader = preg_replace('/<rss(.*?)>/i', 
		  '<rss\1 xmlns:atom="http://www.w3.org/2005/Atom">', 
		  ob_get_contents());

		ob_end_clean();
		echo $rssHeader;
		echo "\t\t<atom:link href=\"" . $article->getTitle()->escapeFullUrl("action=bullet_feed") . 
			"\" rel=\"self\" type=\"application/rss+xml\" />\n";

		//Split the code into news items.
		$titles=preg_split('/<dt>(.*?)/s',$feed);

		$g = 0;
		foreach($titles as $titlekey=>$title) {
			if (preg_match('/[A-Za-z0-9]+/',strip_tags($title))==0) continue;
			$link = $article->getTitle()->getFullUrl();
			//If there's a link in the title, use that as the RSS link,
			//otherwise, use the default site link.
			preg_match('/<span class="rss-desc">(.*?)<\/span>/s',$title,$desc);

			// Extract a date (DD/MM/YYYY or YYYY-MM-DD) from the title, if present.
			$ts = NULL;

			if (preg_match('/([0-9]{1,2})\/([0-9]{1,2})\/([0-9]{4})/s', $title, $date)) {
				$ts = wfTimestamp( TS_MW, strtotime($date[3] . '-' . $date[2] . '-' . $date[1]) );
			} elseif (preg_match('/([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})/s', $title, $date)) {
				$ts = wfTimestamp( TS_MW, strtotime($date[0]) );
			}

			if ($ts !== NULL) {
				// A date was found: remove it from the title text.
				$title = str_replace($date[0], '', $title);
			} else {
				// No date found: fall back to the current time.
				$ts = wfTimestampNow();
			}

			$title = str_replace($desc,'',$title);
			$title = preg_replace($entpa, $entre, $title);
			if(preg_match('/href=\"(.*?)\"/',$title)>0){
				preg_match('/href="(.*?)"/',htmlspecialchars_decode($title),$temp);
				$link=$temp[1];
				if(preg_match('/^\/(.*?)/',$link)){
					$link=$wgServer.$link;
				}
			}

			//Grab the description of the item.
			$description = NULL;

			if (isset($desc[1])) {
				$description = preg_replace_callback('|<a(.*?)>|i', 
					"cleanLink", strip_tags($desc[1], '<a>'));
			}

			$i=new FeedItem(strip_tags($title),$description,$link.'#'.urlencode(trim(strip_tags($title))),$ts);

			ob_start();
			$feedStream->outItem($i);

			$rssItem = ob_get_contents();
			ob_end_clean();

			echo preg_replace('|</title>|i', "</title><guid>" . $link . "</guid>", $rssItem);
		}
		$feedStream->outFooter();
		die();//<--Not sure if needed
	}
	return false;//<--Don't let wiki parse this article further.
}
?>
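
Reading the revision above, cleanLink and cleanLinkAttribute appear to rewrite hrefs that begin with $wgScriptPath into absolute URLs, so that links inside item descriptions still resolve once the feed is read away from the wiki. The following stand-alone sketch shows that idea without MediaWiki; the function name bulletFeedAbsoluteLinks and its parameters are hypothetical, and it handles only the href attribute.

```php
<?php
// Hypothetical stand-alone version of the href-expansion idea: rewrite
// hrefs that start with the wiki's script path into absolute URLs.
// $site is assumed to be "scheme://host" followed by the script path,
// which is how the revision builds its $site value.
function bulletFeedAbsoluteLinks(string $html, string $site, string $scriptPath): string {
    return preg_replace_callback(
        '/href="([^"]*)"/i',
        function (array $m) use ($site, $scriptPath): string {
            $href = $m[1];
            if ($scriptPath !== '' && strpos($href, $scriptPath) === 0) {
                // Replace the leading script path with the full site address.
                $href = $site . substr($href, strlen($scriptPath));
            }
            return 'href="' . $href . '"';
        },
        $html
    );
}

echo bulletFeedAbsoluteLinks(
    '<a href="/wiki/index.php?title=News">News</a>',
    'http://www.mysite.co.uk/wiki',
    '/wiki'
), "\n";
```

Links that already point to another site are left untouched, since they do not start with the script path.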

Breaks with MW1.18 (links within description appear as <!--LINK 0:0--> rather than links)

This problem occurred on upgrade from MW1.16.5 to MW1.18. It occurs on all my browsers.

When there is a [[wikilink]] within the <rss-desc></rss-desc> tags, it is rendered as <!--LINK 0:0--> (further links are <!--LINK 0:1-->, <!--LINK 0:2--> etc). It should just appear as a normal link. I can provide further details if you require.

Help! Thanks Jonathan3 18:45, 11 December 2011 (UTC)

Hmm, I just tested this on a MW 1.18.2 wiki but was not able to recreate it. Something else must have gone wrong. However, a maintainer for this extension would be cool to have. Cheers --[[kgh]] (talk) 14:19, 29 March 2012 (UTC)
Only just noticed your reply. I'll upgrade to the current version of MediaWiki and see if the problem continues for me. I'll give more detail here, either way, when I upgrade. Thanks. Jonathan3 (talk) 09:21, 30 August 2013 (UTC)
Sorry for the long delay! I've tried it again with MW1.24 and the error described above does not appear. Jonathan3 (talk) 14:17, 12 April 2015 (UTC)