
pytube errors

by ์Šค๋‹ 2022. 10. 30.

After installing the pytube library and running my code, I hit the two errors below.

  1. cipher.py error

  • Edit cipher.py
    • Path
      Windows: C:\ProgramData\Anaconda3\Lib\site-packages\pytube
      macOS: search with Spotlight, or open the folder Users/[Username]/opt/anaconda3/lib/python3.9/site-packages/pytube
  • See https://github.com/pytube/pytube/issues/1281
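If pytube is installed somewhere other than the paths listed above (a virtual environment, a different Python version), the folder can be looked up from Python itself. A minimal sketch using only the standard library; `package_dir` is a hypothetical helper name, not part of pytube:

```python
import importlib.util
import os


def package_dir(name):
    """Return the folder a top-level package is installed in, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py; its parent is the package folder.
    return os.path.dirname(spec.origin)


# cipher.py and captions.py sit directly inside this folder.
# Prints None if pytube is not installed in the current environment.
print(package_dir("pytube"))
```

Run it with the same interpreter that runs your pytube code, otherwise it may report a different environment's copy.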
# Original code
nfunc=function_match.group(1))

# Changed code
nfunc=re.escape(function_match.group(1)))
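Why `re.escape` helps: the captured function name is pasted into another regular expression, so a name containing a regex metacharacter such as `$` changes what the pattern means. A small sketch of the effect; the JavaScript string, the name `$b`, and the exact pattern are made-up stand-ins for illustration, not copied from pytube:

```python
import re

# Hypothetical stand-ins: a JS snippet and a captured name that contains "$".
js = "var $b=[1,2,3];"
name = "$b"

template = r"var {nfunc}\s*=\s*(\[.+?\]);"

# Without escaping, "$" is read as an end-of-string anchor, so nothing matches.
unescaped_match = re.search(template.format(nfunc=name), js)

# With escaping, "$" is matched literally and the array is found.
escaped_match = re.search(template.format(nfunc=re.escape(name)), js)

print(unescaped_match)         # None
print(escaped_match.group(1))  # [1,2,3]
```

When the search returns None, any later call on the result fails, which is consistent with the kind of error the one-line change is meant to avoid.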
  2. captions.py error

# Original code

def xml_caption_to_srt(self, xml_captions: str) -> str:
        """Convert xml caption tracks to "SubRip Subtitle (srt)".
        :param str xml_captions:
            XML formatted caption tracks.
        """
        segments = []
        root = ElementTree.fromstring(xml_captions)
        for i, child in enumerate(list(root)):
            text = child.text or ""
            caption = unescape(text.replace("\n", " ").replace("  ", " "),)
            try:
                duration = float(child.attrib["dur"])
            except KeyError:
                duration = 0.0
            start = float(child.attrib["start"])
            end = start + duration
            sequence_number = i + 1  # convert from 0-indexed to 1.
            line = "{seq}\n{start} --> {end}\n{text}\n".format(
                seq=sequence_number,
                start=self.float_to_srt_time_format(start),
                end=self.float_to_srt_time_format(end),
                text=caption,
            )
            segments.append(line)
        return "\n".join(segments).strip()
# Changed code

def xml_caption_to_srt(self, xml_captions: str) -> str:
        """Convert xml caption tracks to "SubRip Subtitle (srt)".

        :param str xml_captions:
        XML formatted caption tracks.
        """
        segments = []
        root = ElementTree.fromstring(xml_captions)
        i=0
        for child in list(root.iter("body"))[0]:
            if child.tag == 'p':
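                caption = ''
                if len(list(child))==0:
                    # No <s> children: use the <p> text itself instead of skipping the entry
                    caption = child.text or ''
                for s in list(child):
                    if s.tag == 's':
                        # s.text can be None for an empty <s> element
                        caption += ' ' + (s.text or '')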
                caption = unescape(caption.replace("\n", " ").replace("  ", " "),)
                try:
                    duration = float(child.attrib["d"])/1000.0
                except KeyError:
                    duration = 0.0
                start = float(child.attrib["t"])/1000.0
                end = start + duration
                sequence_number = i + 1  # convert from 0-indexed to 1.
                line = "{seq}\n{start} --> {end}\n{text}\n".format(
                    seq=sequence_number,
                    start=self.float_to_srt_time_format(start),
                    end=self.float_to_srt_time_format(end),
                    text=caption,
                )
                segments.append(line)
                i += 1
        return "\n".join(segments).strip()
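
The difference between the two versions comes down to the caption layout: the original code reads `start` and `dur` in seconds from the root's direct children, while the changed code reads `t` and `d` in milliseconds from `<p>` entries under `<body>`. A standalone sketch of reading the newer layout; the sample XML and the helper name `read_cues` are made up for illustration, and real caption tracks may carry more attributes:

```python
from xml.etree import ElementTree

# Hypothetical sample in the newer layout: <p> entries under <body>,
# with start "t" and duration "d" given in milliseconds.
SAMPLE_XML = (
    "<timedtext><body>"
    '<p t="0" d="1500"><s>Hello</s><s>world</s></p>'
    '<p t="1500" d="2000">Plain text entry</p>'
    "</body></timedtext>"
)


def read_cues(xml_captions):
    """Return (start_seconds, end_seconds, text) tuples, one per <p> entry."""
    root = ElementTree.fromstring(xml_captions)
    body = next(root.iter("body"), None)
    cues = []
    if body is None:
        return cues
    for p in body.iter("p"):
        # Join the text of <s> children, or fall back to the <p> text itself.
        parts = [s.text or "" for s in p.findall("s")] or [p.text or ""]
        text = " ".join(part.strip() for part in parts).strip()
        start = float(p.attrib["t"]) / 1000.0
        end = start + float(p.attrib.get("d", 0)) / 1000.0
        cues.append((start, end, text))
    return cues


cues = read_cues(SAMPLE_XML)
for start, end, text in cues:
    print(start, end, text)
```

Feeding the same sample to the original function would fail, since the only child of the root is `<body>` and it has no `start` attribute.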
