NEW! meta:User:James Salsman James Salsman (talk) 05:49, 16 March 2017 (UTC)

James Salsman is a statistician and software engineer with over 30 years of experience in speech, signal processing, C, Python, Perl, JavaScript, R, SQL, Tcl/Tk, WebRTC, and related technologies. He is currently working on speech recognition for pronunciation evaluation, helping people learn to speak and read well. Salsman's contributions to open source software include substantial improvements to the efficiency of the phase vocoder algorithm, upgrades to Tcl and Android, and work to patch and extend MediaWiki. He studied computer science and statistics at Carnegie Mellon University.

Other interests include:

  • Hack the Future (volunteer mentor)
  • Google Summer of Code (volunteer mentor)

Selected highlights

2017: 30 million K-6 English as a Second Language student customers of an independent commercial homework app, assigned by teachers selecting from competing products in every province of China.

EF Education First, EF Learning Labs, Shanghai, China, 2013–2014: Improved automatic speech recognition (ASR) systems providing pronunciation assessment for English language learning by diagnosing Adobe Flash-based microphone upload channel faults, immediately reversing a 30% accuracy drop that preceded my arrival. Architected, validated, and implemented a further pronunciation assessment accuracy improvement using Sensory Fluentsoft ASR, with phoneme duration and acoustic scores normalized against a leaderboard of exemplar pronunciations drawn from student uploads, achieving a 24% increase in scores’ agreement with a panel of human judges. Prototyped auditory feedback for pronunciation exercises, designed ASR QA systems, and made additional word and phrase score improvements on cross-platform mobile and desktop ASR implementations. Made several other contributions to processes, internal technical documentation, and online learning functions. Used C, JavaScript, sh, C#, and Objective-C on Android, iOS, Linux servers, Windows ASP.NET servers and desktop, and OS X.

Selected publications

Yuan Gao, Brij Mohan Lal Srivastava, James Salsman (2017) "Spoken English Intelligibility Remediation with PocketSphinx Alignment and Feature Extraction Improves Substantially over the State of the Art." In press:

J. Salsman (July 2014) “Development challenges in automatic speech recognition for computer assisted pronunciation teaching and language learning” in Proceedings of the Research Challenges in Computer Aided Language Learning Conference (CALL 2014) Antwerp, Belgium:

S. Ronanki, J. Salsman, and L. Bo (December 2012) “Automatic Pronunciation Evaluation and Mispronunciation Detection using CMU Sphinx.” in Proceedings of the Workshop on Speech and Language Processing Tools in Education, pp. 61–68. 24th International Conference on Computational Linguistics (COLING 2012) Mumbai, India:

K. Roast and J. Salsman (August 2011) “K3D JavaScript Canvas Library.” Software documentation:

J. Salsman (May 2010) “Asynchronous Microphone Upload – for Pronunciation Assessment, High-Quality, Low-Bandwidth Voice, Speech Transcription, Translation, and Speaker Identification and Verification.” in the Proceedings of the World Wide Web Consortium Workshop on Conversational Applications (W3C CONVAPPS) June 18–19, 2010, Somerset, New Jersey:

J. Salsman (October 2010) “Teaching computers to teach people to read and speak.” One Laptop Per Child San Francisco Bay Area Community Summit (OLPC-SF 2010) presentation. San Francisco, California:

J. Salsman (2005) “ReadSay PROnounce English System.” Self-published commercial software and instructional modules:

J. Salsman (August 2004) “Getting Sorted Indices out of lsort.” Tcl Improvement Proposal (TCL TIP) #217. Tcl Developer Xchange:

J. P. Salsman (July 1999) “Form-based Device Input and Upload in HTML.” World Wide Web Consortium Note submission from Cisco Systems, San Jose, California:

J. Salsman and H. Alvestrand (May 1999) “The Audio/L16 MIME content type.” Internet Engineering Task Force Request for Comments (IETF RFC 2586)

Interested in

...among other things. Jsalsman (talk) 00:14, 13 February 2015 (UTC)