Thursday, May 16, 2019

DataEng: GCS: Connect to Datalab


Before working in a Jupyter notebook hosted by Datalab on Compute Engine, the user must first connect to the Datalab instance.


To list the Datalab instances under an account:
Command: datalab list

C:\Users\xxxxxxxx\AppData\Local\Google\Cloud SDK>datalab list
NAME                ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
datalab-kl-project  us-central1-c  n1-standard-1               10.128.11.12                TERMINATED


TERMINATED indicates that the instance is stopped. This is different from the instance not existing. (Another blog post shows in more detail how to start it.)
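
The status check can also be scripted. The following is a minimal Python sketch, assuming the `datalab list` output keeps the column layout shown above (NAME first, STATUS last). `parse_datalab_list` is a hypothetical helper written for this post, not part of the datalab tool.

```python
def parse_datalab_list(output):
    """Map each instance NAME to its STATUS from `datalab list` text output.

    Assumes the first non-blank line is the header, the first column is
    NAME and the last column is STATUS. Columns in between may be empty.
    """
    statuses = {}
    lines = [line for line in output.splitlines() if line.strip()]
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            statuses[fields[0]] = fields[-1]
    return statuses


sample = """\
NAME                ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
datalab-kl-project  us-central1-c  n1-standard-1               10.128.11.12                TERMINATED
"""

statuses = parse_datalab_list(sample)
print(statuses)  # {'datalab-kl-project': 'TERMINATED'}
```

Only the first and last fields are read because the PREEMPTIBLE and EXTERNAL_IP columns can be blank, which makes positional splitting of the middle columns unreliable.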

Command: datalab connect <datalab_name> --port <port_num>
Example: datalab connect datalab-kl-project --port 8123

Note: The Datalab name is not the project name.
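
When the connect command is launched from a script, it helps to build the argument list in one place. This is a small Python sketch under that assumption. `build_connect_command` is a hypothetical helper, and the instance name and port are the ones from the example above.

```python
def build_connect_command(instance_name, port):
    """Return the argument list for `datalab connect <name> --port <port>`."""
    if not instance_name:
        raise ValueError("instance_name must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return ["datalab", "connect", instance_name, "--port", str(port)]


command = build_connect_command("datalab-kl-project", 8123)
print(" ".join(command))  # datalab connect datalab-kl-project --port 8123

# To actually open the tunnel, the list could be passed to subprocess.run.
# That call blocks until the connection is closed, and on Windows it may
# need shell=True, depending on how the Cloud SDK is installed (assumption).
```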



[Learned]

Error: (gcloud.compute.instances.list) Some requests did not succeed:
 - Failed to find project datalab-1-123456

A nested call to gcloud failed, use --verbosity=debug for more info.

This means the project does NOT currently exist, or it has been shut down and is sitting in the Pending To Delete section. It needs to be created, or restored from the pending-to-delete list, and then started from the GCP Console. Once the project is started, billing needs to be enabled for it. Then search for the project again. At this point, Compute Engine may still glitch briefly and report that it is unable to locate the resource or project, even though the project has been started.
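
To spot this case in a script, the project ID can be pulled out of the error text. The sketch below assumes the message keeps the "Failed to find project <id>" wording shown above. `find_missing_project` is a hypothetical helper.

```python
import re


def find_missing_project(error_text):
    """Return the project ID named in a 'Failed to find project' error, or None."""
    match = re.search(r"Failed to find project (\S+)", error_text)
    return match.group(1) if match else None


error_text = """\
Some requests did not succeed:
 - Failed to find project datalab-1-123456
"""

print(find_missing_project(error_text))  # datalab-1-123456
```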


[Learned]
A firewall might prevent the connection to GCP. If disabling the firewall is possible, do so temporarily so that the connection can be established.

There might not be a specific error message related to security; instead, the output will look like the following messages.


Waiting for Datalab to be reachable at http://localhost:8123
Timeout waiting for the connection to become healthy.Trying again with a new connection...
Connection closed
Attempting to reconnect...
Waiting for Datalab to be reachable at http://localhost:8123
.....
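
While the command keeps retrying like this, a quick check from a second terminal shows whether anything is listening on the local port at all. This is a minimal Python sketch using only the standard library. `is_port_open` is a hypothetical helper, and port 8123 is the one from the example above. An open port only shows that the local end of the tunnel is accepting connections, not that Datalab itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


print(is_port_open("localhost", 8123))
```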

If it connects without incident, the output should look like the following, and Datalab and the Jupyter notebook are ready.

C:\> datalab connect datalab-kl-project --port 8123
Connecting to datalab-kl-project.
This will create an SSH tunnel and may prompt you to create an rsa key pair. To manage these keys, see https://cloud.google.com/compute/docs/instances/adding-removing-ssh-keys
Waiting for Datalab to be reachable at http://localhost:8123/

The connection to Datalab is now open and will remain until this command is killed.
You can connect to Datalab at http://localhost:8123/

