
Long jobs vs large strings of tiny jobs

Brooke Vibber (WMF)

For TimedMediaHandler's video transcoding we currently use one job per output file, which can run anywhere from a couple of minutes to many hours -- this has been problematic when there are large floods of uploads that need handling simultaneously, or when batch-reprocessing for changed output formats.

As part of a planned pivot of output formats from flat files to MPEG-DASH streaming, we'll have the opportunity to split the output into many small files, which lets us divide each large job into (potentially a lot of) small jobs, each converting about 10 seconds of video.
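
For concreteness, here's a minimal sketch of a per-segment job as a MediaWiki Job subclass -- all the names here (WebVideoSegmentJob, the 'webVideoSegment' type, the params) are hypothetical, not actual TMH code:

    <?php
    // Hypothetical per-segment job: one queued job per ~10-second chunk.
    class WebVideoSegmentJob extends Job {
        public function __construct( Title $title, array $params ) {
            // $params carries the transcode key and the segment index to produce
            parent::__construct( 'webVideoSegment', $title, $params );
        }

        public function run() {
            $file = wfLocalFile( $this->title );
            if ( !$file || !$file->exists() ) {
                // Source vanished since queueing; treat as a no-op success
                return true;
            }
            $start = $this->params['segment'] * 10; // offset in seconds
            // ... run ffmpeg over [$start, $start + 10) and store the segment ...
            return true;
        }
    }

    // The "queue everything up front" option for a 500-segment video:
    $jobs = [];
    for ( $seg = 0; $seg < 500; $seg++ ) {
        $jobs[] = new WebVideoSegmentJob( $title,
            [ 'key' => '360p', 'segment' => $seg, 'segmentCount' => 500 ] );
    }
    JobQueueGroup::singleton()->push( $jobs );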

I've still got open questions about how to handle this sort of job. If we have a long video with 500 segments, do we:

  • queue up 500 segment jobs and let them drain out?
  • or queue up one or a few jobs for the first chunks, which "fan out" to produce the following jobs as they complete? (see the sketch after this list)
  • or something else?
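
The fan-out variant could look like this (same hypothetical class as above): each job queues its successor as its last step, so only one or a few segment jobs per file sit in the queue at any moment:

    public function run() {
        // ... transcode segment $this->params['segment'] as above ...

        // Fan out: enqueue the next segment only once this one succeeds.
        $next = $this->params['segment'] + 1;
        if ( $next < $this->params['segmentCount'] ) {
            // Array union: the new 'segment' wins, other params carry over
            JobQueueGroup::singleton()->push( new self(
                $this->title,
                [ 'segment' => $next ] + $this->params
            ) );
        }
        return true;
    }

The trade-off: pure fan-out serializes the segments of one file (kinder to the queue, slower wall-clock per file), while seeding N initial jobs gives N-way parallelism per file without ever flooding the queue with all 500.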

Concerns:

  • need to be able to cancel if the file is deleted, moved, etc. -- can we cancel the jobs and remove them from the queue, or do we just have to set a flag that lets them no-op once they get run? (see the sketch after this list)
  • want to avoid floods of batch jobs, but how do we set priorities? (new files > rework of old files? low resolutions > high resolutions?) The current scheme uses two queues to ensure that low resolutions are processed alongside high resolutions.
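
On cancellation: as far as I know the JobQueue API only lets you drop a whole queue, not pluck out individual jobs, so the flag approach seems like the realistic one. A sketch, with an assumed state table and fields (not a real schema):

    public function run() {
        $dbr = wfGetDB( DB_REPLICA );
        // 'video_segment_state' / 'vss_*' are made up for illustration
        $state = $dbr->selectField(
            'video_segment_state',
            'vss_state',
            [ 'vss_title' => $this->title->getDBkey(), 'vss_key' => $this->params['key'] ],
            __METHOD__
        );
        $file = wfLocalFile( $this->title );
        if ( $state === 'cancelled' || !$file || !$file->exists() ) {
            return true; // cancelled, deleted, or moved: discard quietly
        }
        // ... do the segment transcode ...
        return true;
    }

For priorities, the existing two-queue trick generalizes: register the same Job class under two type names in $wgJobClasses (say, webVideoSegment and webVideoSegmentPrioritized) and have the runners weight the two queues differently -- new uploads go to the prioritized type, batch rework to the plain one.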

Any thoughts?
