Python – WordCloud in Japanese – VS Code on Ubuntu No.90

This post shows how to generate a word cloud from Japanese text using Python in Visual Studio Code on Ubuntu.

▼1. What is WordCloud?

WordCloud is a visualization tool that highlights the most prominent words in a text by drawing each word at a size based on its frequency. In this post, the Japanese Wikipedia article on the 2022 FIFA World Cup in Qatar (2022 FIFAワールドカップ) is used as the source text.
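
To make the "size based on frequency" idea concrete, here is a minimal sketch that counts words and scales a font size linearly with each count. It is only an illustration: the function name `scale_font_sizes`, the size limits, and the pre-tokenized sample list are assumptions made for this example, and the wordcloud library's own sizing and layout logic is more involved than this.

```python
from collections import Counter


def scale_font_sizes(words, min_size=10, max_size=60):
    """Map each distinct word to a font size that grows with its count."""
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; others are scaled down linearly.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Hypothetical, already-tokenized sample (not taken from the real article).
sample = ["ワールドカップ", "カタール", "ワールドカップ", "FIFA", "ワールドカップ", "カタール"]
sizes = scale_font_sizes(sample)
for word, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {size:.1f}")
```

The word that appears most often ends up with the largest size, which is the effect visible in the generated image.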

wordcloud-FIFA2022-quatar

▼2. Prerequisites

2-1. Preparing the Python environment

Python – Using Visual Studio Code No.34

2-2. Installing WordCloud

https://pypi.org/project/wordcloud/

pip install wordcloud
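
After installing, it can be useful to confirm that the packages used later in this post are importable from the Python interpreter selected in VS Code. The following is a small sketch using only the standard library; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


# Packages imported by the script in section 3-1.
for name in ("wordcloud", "numpy", "matplotlib"):
    status = "OK" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If any package is reported as missing, installing it with pip in the same environment should resolve it.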

2-3. Downloading Japanese Font

IPAex Fonts Ver.004.01 | Character Information Technology Promotion Council (moji.or.jp)

unzip ipaexm00401.zip
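
The script in section 3-1 needs the full path to the extracted .ttf file. As a rough sketch, the following lists every .ttf file under the current folder so that the path can be copied; the helper name `find_fonts` is an assumption for this example, and it simply searches recursively from wherever it is run.

```python
from pathlib import Path


def find_fonts(root):
    """Return all .ttf files found under root, sorted by path."""
    return sorted(Path(root).rglob("*.ttf"))


# Print absolute paths so one can be pasted into the font_path parameter.
for font in find_fonts("."):
    print(font.resolve())
```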

2-4. Copying Japanese text from Wikipedia

The file textjpnv2.txt is created by copying the text from this Wikipedia article.

2022 FIFA World Cup (2022 FIFAワールドカップ), Japanese Wikipedia
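
A quick sanity check on the copied text is to count how many characters are actually Japanese, which helps catch an empty or wrongly encoded file. The sketch below is an assumption-laden illustration: the helper name `count_japanese_chars` is invented here, it only looks at the hiragana, katakana, and CJK Unified Ideographs blocks, and it uses a short sample string instead of the real file.

```python
def count_japanese_chars(text):
    """Count hiragana, katakana, and common kanji (CJK Unified Ideographs)."""
    ranges = ((0x3040, 0x309F), (0x30A0, 0x30FF), (0x4E00, 0x9FFF))
    return sum(any(lo <= ord(ch) <= hi for lo, hi in ranges) for ch in text)


# Sample string for illustration; replace it with the contents of the text file.
sample = "2022 FIFAワールドカップはカタールで開催された。"
print(count_japanese_chars(sample), "of", len(sample), "characters are Japanese")
```

To check the real file, read it with `open('textjpnv2.txt', encoding='utf-8')` and pass the result to the same function.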

2-5. Starting VS Code

Create the folder “wordcloudtest”, move into it, and run the code . command to open VS Code.

mkdir wordcloudtest
cd wordcloudtest
code .

▼3. Let’s run WordCloud using Python

3-1. Creating Python code to run WordCloud

Create the Python file “wctest3.py” to run WordCloud. The path to the Japanese font is passed via the “font_path” parameter in this code.

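import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Build a 300x300 circular mask: 255 (white) outside the circle is masked out,
# 0 inside the circle is the area where words may be drawn.
x, y = np.ogrid[:300, :300]
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
mask = 255 * mask.astype(int)

# Read the Japanese text explicitly as UTF-8.
with open('textjpnv2.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# font_path must point to a font with Japanese glyphs (here: IPAex Mincho).
wc = WordCloud(
    mask=mask,
    background_color="white",
    font_path="/home/xxx/wordcloudtest/ipaexm00401/ipaexm.ttf",
).generate(text)

plt.imshow(wc, interpolation='bilinear')
plt.axis("off")
plt.show()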
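
The mask is the least obvious part of the script, so here is a tiny pure-Python sketch of the same circle test on an 11x11 grid, printed as text. The function name `circle_mask` and the grid size are assumptions for illustration only; the real script uses NumPy and a 300x300 grid with radius 130.

```python
def circle_mask(size, radius):
    """Return a size x size grid: 255 outside the circle, 0 inside it."""
    center = size // 2
    return [
        [255 if (x - center) ** 2 + (y - center) ** 2 > radius ** 2 else 0
         for y in range(size)]
        for x in range(size)
    ]


mask = circle_mask(size=11, radius=4)
# "#" marks cells with value 0 (drawable), "." marks cells with value 255 (masked out).
for row in mask:
    print("".join("." if value == 255 else "#" for value in row))
```

Cells with value 255 correspond to the white area that WordCloud leaves empty, which is why the words in the result fill a circle.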

3-2. Result

wordcloud-FIFA2022-quatar

▼4. Reference

  1. Get Started Tutorial for Python in Visual Studio Code
  2. https://pypi.org/project/wordcloud/
  3. https://amueller.github.io/word_cloud/auto_examples/single_word.html
  4. How to Download and Use IPA Fonts | Beginner DIY Programming for Hobby and Work (resanaplaza.com)
  5. IPAex Fonts Ver.004.01 | Character Information Technology Promotion Council (moji.or.jp)

That’s all. Have a nice day ahead !!!
