User talk:Nischayn22/Gsoc


MWJames (talkcontribs)

Hi Nischay,

Welcome to the SMW world. I hope you will find working with Semantic MediaWiki satisfying and rewarding. I think everyone using SMW is looking forward to seeing some performance enhancements, especially for table-intensive select queries. For example, in our case we store around 1.1M triples, and we feel the pinch every day when performance degrades as people start making multiple queries using either #ask or Semantic Drilldown (we already have memcache, Squid, and APC in place to deal with the broader aspects of performance latency).

So good luck, and feel free to contact the community.

Cheers

Nischayn22 (talkcontribs)

Hi,

Thanks for the warm wishes :)

I do feel rewarded getting the chance to work on this project. The project aims to identify the causes of performance degradation in SMW, so there is a lot to learn from user experience, and your feedback is highly valuable to me. I will be actively working on this project from May 8th and will shortly mail the SMW-users list to gather further feedback from the larger community of users. My initial focus will be on the unnecessary write queries that SMW makes on each page edit; this will give me the familiarity with the SMW code that I need to then look into the performance degradation of #ask queries. Issues related to SMW extensions are still in the *If time permits* category right now, and I aim to look into them after this summer.

Nischayn22 (talkcontribs)

It would be really helpful if you could describe specifically what performance degradation you are seeing. Are the queries running slowly, or is it something else that is pinching you?

MWJames (talkcontribs)

On a quantitative level I can't really pinpoint where the issues exist, but we get feedback from users that problems arise when they use any of the following. Of course, all of this depends on how many users are doing similar things at the same time.

  • Compound #ask queries with limit > 250
  • Semantic Drilldown + display filters > 50
  • Tagcloud maxtags > 2000 + limit > 250

These problems have led us to avoid running #ask queries directly within a page itself. Instead, we encourage users to display those queries with limit=0 + searchlabel=..., so that execution is only carried out on demand, when a user feels the need to see the query results. We still indicate how many results the user can expect for a specific query by showing the count.

{{#ask: <query>|limit=0|searchlabel={{#ask: <same query>|format=count}} }}
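Since the pattern above requires writing the same query twice, one way to keep the two copies in sync is to generate the markup. The following is only a rough sketch in Python: the helper name `deferred_ask` and the example query `[[Category:City]]` are made up for illustration and are not part of SMW.

```python
def deferred_ask(query: str) -> str:
    """Build the wikitext for the limit=0 + searchlabel pattern.

    `query` is the #ask condition, e.g. "[[Category:City]]" (a hypothetical
    example). The same condition is reused for the inner count query, so the
    link label always matches the query the link will run.
    """
    # Inner query: returns only the number of matching results.
    count = "{{#ask: " + query + "|format=count}}"
    # Outer query: limit=0 prints no results inline, only a link whose
    # label is given by searchlabel.
    parts = ["{{#ask: " + query, "limit=0", "searchlabel=" + count]
    return "|".join(parts) + " }}"


if __name__ == "__main__":
    print(deferred_ask("[[Category:City]]"))
```

The returned string can be pasted into a page or a template; the helper does not run any query itself.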

Badon (talkcontribs)

I have suspected that part of the performance problem might be helped by using SSD storage, where parallel reads and writes from the database have little or no impact on performance. I'm planning to try this eventually, but since redundant SSD storage is very expensive, I'd like to delay it as long as possible. So, maybe I'll be able to test it with our future, optimized SMW version that Nischayn22 is working on.
