There seems to be a significant delay (about one minute) after boot before networking reaches anything other than the metadata server (169.254.169.254). I've tested with the default Debian Jessie image and with a custom image of mine, and I see the same problem on both.

This can also be verified by connecting to the machine over the serial console: nothing is reachable, not even other machines on the same GCE network, except, as mentioned, the metadata server.

Anyone else see this or have any clue what it's about?

EDIT:

This only happens when the machine "cold boots" (i.e., is started from scratch).

Roughly, I'm seeing the following events:

  • 0s - Press "Start VM" in console
  • 1s - Instance state (according to API) changes to PROVISIONING
  • 4s - Instance state changes to STAGING
  • 20s - Instance state changes to RUNNING
  • 49s - Instance responds to ping from another VM on same subnet.

Starting a ping when I press "Start VM":

$ ping 10.128.0.5
PING 10.128.0.5 (10.128.0.5) 56(84) bytes of data.
64 bytes from 10.128.0.5: icmp_seq=49 ttl=64 time=1.08 ms
64 bytes from 10.128.0.5: icmp_seq=50 ttl=64 time=0.285 ms

So there are about 25 seconds of network isolation, assuming boot (including DHCP) takes around 5 seconds, as the reboot case suggests.

Compare this with a reboot of the instance, where it is unreachable for only about 7 seconds (and that includes a couple of seconds for shutdown):

$ ping 10.128.0.5
PING 10.128.0.5 (10.128.0.5) 56(84) bytes of data.
64 bytes from 10.128.0.5: icmp_seq=1 ttl=64 time=1.07 ms
64 bytes from 10.128.0.5: icmp_seq=2 ttl=64 time=0.271 ms
64 bytes from 10.128.0.5: icmp_seq=3 ttl=64 time=0.236 ms
64 bytes from 10.128.0.5: icmp_seq=4 ttl=64 time=0.295 ms
64 bytes from 10.128.0.5: icmp_seq=5 ttl=64 time=0.316 ms
64 bytes from 10.128.0.5: icmp_seq=12 ttl=64 time=0.595 ms
64 bytes from 10.128.0.5: icmp_seq=13 ttl=64 time=0.240 ms
64 bytes from 10.128.0.5: icmp_seq=14 ttl=64 time=0.238 ms
64 bytes from 10.128.0.5: icmp_seq=15 ttl=64 time=0.299 ms
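
For anyone who wants to repeat the measurement, here is a minimal sketch that pulls the same numbers out of a ping transcript. The function name `summarise_ping` is made up for this example, and it assumes ping's default one-second interval, so an icmp_seq value is roughly the number of seconds since ping was started.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ping` output on stdin, print the icmp_seq of
# the first reply and any gaps between consecutive replies.
# Assumes the default 1-second ping interval, so seq numbers ~ seconds.
summarise_ping() {
  awk '
    match($0, /icmp_seq=[0-9]+/) {
      # "icmp_seq=" is 9 characters long; keep only the number after it
      seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
      if (!seen) {
        print "first reply at icmp_seq=" seq
        seen = 1
      } else if (seq > prev + 1) {
        print "gap: " (seq - prev - 1) " missing between icmp_seq=" prev " and " seq
      }
      prev = seq
    }
    END { if (!seen) print "no replies" }
  '
}
```

Usage would be something like `ping -c 90 10.128.0.5 | summarise_ping`, started at the moment you press "Start VM" (the `-c` count makes ping exit so the summary is printed). Fed the two transcripts above, it reports a first reply at icmp_seq=49 for the cold boot, and a first reply at icmp_seq=1 plus a gap of 6 missing replies between icmp_seq=5 and 12 for the reboot.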
    • I was not able to reproduce this with a debian-8-jessie instance. Can you go through the instance's serial console output to confirm that the instance booted properly and was assigned an internal IP?

A GCP employee explained the reason on Hacker News: https://news.ycombinator.com/item?id=15343888

We're working on it. It's a major initiative for us, because as you see we get the damn thing "booting" in a handful of seconds and then reachability to/from the internet is the long pole. Fwiw, we at least got to/from *.googleapis.com way down, so if you need to say fetch something from GCS, that should be a bit faster.

At a high level, it's the result of having global, flat Networks and not wanting to declare the network "up" until you've "programmed" all the routes. So if you have 1000 VMs distributed globally, you get to make sure that they're not "connected" until your new VM in asia-east1-a can talk to all other VMs in your Network (and vice versa). With the to/from API path this routing is much simpler since you don't get the N^2 behavior.
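
To put rough numbers on the "N^2" remark, here is a back-of-the-envelope count. It assumes the thing being counted is directed VM-to-VM reachability ("can talk to all other VMs ... and vice versa"); that is my reading of the quote, not a description of how Google actually programs routes.

```latex
% Assumption: one directed path per ordered pair of VMs in the Network.
\begin{align*}
  \text{directed pairs among } N \text{ VMs} &= N(N-1) \\
  N = 1000 &\;\Rightarrow\; 1000 \times 999 = 999{,}000 \\
  \text{new pairs when one VM joins } N \text{ existing VMs} &= 2N
    \;\Rightarrow\; 2000 \text{ for } N = 1000
\end{align*}
```

By contrast, on the to/from API path each VM only needs its own path to the API endpoints, so the count grows with N rather than with N squared, which fits the remark that this case is much simpler.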

This article is reproduced from Stack Exchange / Stack Overflow.