Date: January 24th, 2010 | Category: Personal | 3 Comments »
So as not to break the blogging habit too much, here is an excerpt from a topic we were discussing on RolFe (a private forum I share with my role-playing friends).
Some humor (long image)
My goals for 2010 are: Read the rest of this entry »
Date: January 5th, 2010 | Category: Internet | 13 Comments »
Go to the Online version
This is what I've been working on today. It's a simple console-based script to download subtitles for TED Talks, since I haven't found a way to download them directly from the web in a compatible format (I generally use '.srt' subtitles). Here is the script, written in Python: TEDTalkSubtitles.py
Key parts of the program:
A simple function to convert a value in milliseconds to something like "00:34:32,334":
    def getFormatedTime(intvalue):
        # Split a millisecond count into hours, minutes, seconds and milliseconds.
        mils = intvalue % 1000
        segs = (intvalue // 1000) % 60
        mins = (intvalue // 60000) % 60
        hors = intvalue // 3600000
        return "%02d:%02d:%02d,%03d" % (hors, mins, segs, mils)
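To check the arithmetic, here is a small self-contained sketch of the same conversion written with divmod. The name format_srt_time is my own for this illustration and is not part of the script.

```python
def format_srt_time(ms):
    """Format a millisecond count as an SRT timestamp (HH:MM:SS,mmm)."""
    secs, mils = divmod(ms, 1000)   # peel off the milliseconds
    mins, secs = divmod(secs, 60)   # then the seconds
    hours, mins = divmod(mins, 60)  # then the minutes; what is left is hours
    return "%02d:%02d:%02d,%03d" % (hours, mins, secs, mils)


# 34 minutes, 32 seconds and 334 milliseconds
print(format_srt_time(34 * 60000 + 32 * 1000 + 334))  # 00:34:32,334
```

Each divmod call returns the quotient and the remainder in one step, so the integer-division and modulo pairs collapse into three lines.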
This recursive function fetches the languages available for the talk:
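    def availableSubs(subs):
        # Find the next "LanguageCode" marker; stop when there are none left.
        a = subs.find("LanguageCode")
        if a == -1:
            return []
        subs = subs[a + len("LanguageCode"):]
        # Capture the text between URL-encoded quotes (%22) that has no uppercase letters.
        return [re.search("%22([^A-Z]+)%22", subs).group(1)] + availableSubs(subs)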
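The same scan can also be written as a loop, which avoids re-slicing the string on every call and skips a marker that has no quoted value after it instead of failing. This is only a sketch: available_subs is a hypothetical name, and the sample string is made up to resemble URL-encoded JSON, not copied from a real TED page.

```python
import re

# Text between URL-encoded quotes (%22) containing no uppercase letters
CODE_PATTERN = re.compile("%22([^A-Z]+)%22")
MARKER = "LanguageCode"


def available_subs(text):
    """Collect the value that follows each "LanguageCode" marker."""
    codes = []
    pos = text.find(MARKER)
    while pos != -1:
        after_marker = pos + len(MARKER)
        match = CODE_PATTERN.search(text, after_marker)
        if match:  # ignore a marker with nothing usable after it
            codes.append(match.group(1))
        pos = text.find(MARKER, after_marker)
    return codes


# Made-up sample input (assumption about the page format)
sample = ("%7B%22LanguageCode%22%3A%22en%22%2C%22Name%22%3A%22English%22%7D"
          "%2C%7B%22LanguageCode%22%3A%22es%22%7D")
print(available_subs(sample))  # ['en', 'es']
```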
Get information about the video:
    def getVideoParameters(urldirection):
        # Download the talk page (Python 2 urllib).
        ht = urllib.urlopen(urldirection).read()
        # Grab the body of the "flashVars = { ... }" block.
        var = re.search('flashVars = {\n([^}]+)}', ht)
        if var:
            var = var.group(1)
        else:
            return None
        # One entry per line: drop tabs, then cut each line at its last comma.
        var = [a.replace('\t', '') for a in var.split('\n')]
        for a in range(len(var)):
            if var[a]:
                var[a] = var[a][:var[a].rfind(',')]
        # Split each "key:value" line at the first colon.
        resultado = []
        for a in var:
            l = a.find(':')
            if l != -1:
                resultado.append((a[:l], a[l+1:]))
        return dict(resultado)
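To try the parsing step without touching the network, the sketch below runs a variant of it on a string. The function name parse_flash_vars, the sample page and its keys (talkId, introTime, lang) are all made up for illustration; the real page layout may differ.

```python
import re


def parse_flash_vars(html):
    """Return the key/value pairs of a 'flashVars = {...}' block, or None."""
    match = re.search(r"flashVars = \{\n([^}]+)\}", html)
    if match is None:
        return None
    params = {}
    for line in match.group(1).split("\n"):
        # Drop tabs and a trailing comma, if there is one
        line = line.replace("\t", "").rstrip(",")
        key, sep, value = line.partition(":")  # split at the first colon
        if sep:  # skip lines that have no colon (e.g. the empty last line)
            params[key] = value
    return params


# Hypothetical page fragment
sample_page = 'var x;\nflashVars = {\n\ttalkId:123,\n\tintroTime:16500,\n\tlang:"en"\n}\n'
print(parse_flash_vars(sample_page))
```

Values come back as raw strings, so a quoted value keeps its quotes. This variant strips only a trailing comma, so a last line without one stays intact; slicing at rfind(',') would drop the final character of such a line, because rfind returns -1 when there is no comma.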
Putting it all together:
    def downloadSub(idtalk, lang, timeIntro):
        print("Downloading subtitles for language %s" % lang)
        # Fetch the captions as JSON (Python 2 urllib; simplejson is a third-party package).
        c = simplejson.load(urllib.urlopen('http://www.ted.com/talks/subtitles/id/%s/lang/%s' % (idtalk, lang)))
        salida = open('subs_%s_%s.srt' % (idtalk, lang), 'w')
        conta = 1
        c = c['captions']
        for linea in c:
            # Each SRT cue: index, "start --> end", text, blank line.
            salida.write("%d\n" % conta)
            conta += 1
            # Shift every cue by timeIntro (same millisecond units as startTime).
            salida.write("%s --> %s\n" % (getFormatedTime(timeIntro + linea['startTime']), getFormatedTime(timeIntro + linea['startTime'] + linea['duration'])))
            salida.write("%s\n\n" % (linea['content'].encode('utf-8')))
        salida.close()
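The formatting half of that function can be exercised on its own, without downloading anything. The sketch below runs on Python 3 and builds the SRT text from a list of captions using the same field names as above (startTime, duration, content). The function names and the sample captions are invented for the example.

```python
def format_srt_time(ms):
    """Format a millisecond count as HH:MM:SS,mmm."""
    secs, mils = divmod(ms, 1000)
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return "%02d:%02d:%02d,%03d" % (hours, mins, secs, mils)


def captions_to_srt(captions, intro_ms=0):
    """Build SRT text from captions, shifting every cue by intro_ms."""
    cues = []
    for index, cap in enumerate(captions, start=1):
        start = intro_ms + cap["startTime"]
        end = start + cap["duration"]
        # index line, timing line, text, then a blank line between cues
        cues.append("%d\n%s --> %s\n%s\n\n" % (
            index, format_srt_time(start), format_srt_time(end), cap["content"]))
    return "".join(cues)


# Made-up captions, in milliseconds
sample = [
    {"startTime": 0, "duration": 2500, "content": "Hello"},
    {"startTime": 2500, "duration": 1500, "content": "World"},
]
print(captions_to_srt(sample, intro_ms=15000))
```

Returning a string instead of writing straight to a file keeps the conversion easy to test; the caller can then write it out with open(path, "w", encoding="utf-8").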
Related to:
Parsing and Converting TED Talks JSON Subtitles
Download subtitles from TED talks for offline viewing
Date: January 1st, 2010 | Category: Personal | 1 Comment »
Picking this personal blog back up, I'm writing to get back into a regular posting rhythm and to wish whoever reads this a happy, successful, and prosperous New Year.
To add a bit more content, I want to write a few lines about the authors of various blogs in my very limited RSS reader (adding more things just creates noise for me).
Read the rest of this entry »