If you would like to copy files from your local computer to the Hadoop Distributed File System (HDFS) running inside the Docker container, you first copy them to a directory inside the container, as described in the previous step. Then you can use the hdfs command to copy them into HDFS (and vice versa). Let's illustrate this with an example. The example assumes that a file test.txt is stored in the /tmp directory inside the container.
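As a quick reminder, the copy from the host into the container can be done with docker cp; the container name hadoop used below is an assumption, so substitute the name shown by docker ps:
docker cp test.txt hadoop:/tmp/test.txt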
Log into the container (as described in step 2) and navigate to the /tmp directory. Then copy test.txt to HDFS.
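A minimal sketch of these two steps, again assuming the container is named hadoop (look up the actual name with docker ps):
docker exec -it hadoop bash   # open an interactive shell inside the container
cd /tmp                       # directory that contains test.txt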
hdfs dfs -copyFromLocal test.txt /tmp/test.txt
Open http://localhost:50070 and navigate to Utilities -> Browse the file system to check that the file test.txt exists in HDFS.
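If you prefer the command line over the web interface, the same check can be done from the container shell by listing the HDFS directory:
hdfs dfs -ls /tmp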
Now delete test.txt from the /tmp directory inside the container (and not from HDFS). Then copy test.txt from HDFS back to the container directory.
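Assuming your shell is still in /tmp inside the container, the local copy can be removed with a plain rm before copying the file back:
rm test.txt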
hdfs dfs -copyToLocal /tmp/test.txt test.txt
Finally, delete test.txt from HDFS again (you can verify the deletion via http://localhost:50070).
hdfs dfs -rm /tmp/test.txt