I am trying to insert data from a "large" CSV file (3 MB / 37,000 lines / 7 columns) into a MySQL database using Doctrine data fixtures.

The process is very slow, and so far I have not succeeded (maybe I just had to wait a bit longer).

I suppose Doctrine data fixtures are not intended to handle this amount of data? Maybe the solution would be to import my CSV directly into the database?

Any idea how to proceed?

Here is the code:

<?php

namespace FBN\GuideBundle\DataFixtures\ORM;

use Doctrine\Common\DataFixtures\AbstractFixture;
use Doctrine\Common\DataFixtures\OrderedFixtureInterface;
use Doctrine\Common\Persistence\ObjectManager;
use FBN\GuideBundle\Entity\CoordinatesFRCity as CoordFRCity;

class CoordinatesFRCity extends AbstractFixture implements OrderedFixtureInterface
{
    public function load(ObjectManager $manager)
    {
        $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');

        $i = 0;

        // fgetcsv() returns false at EOF, so no feof() check is needed
        // (feof() loops can yield one spurious false line at end of file).
        while (($line = fgetcsv($csv)) !== false) {

            $coordinatesfrcity[$i] = new CoordFRCity();
            $coordinatesfrcity[$i]->setAreaPre2016($line[0]);
            $coordinatesfrcity[$i]->setAreaPost2016($line[1]);
            $coordinatesfrcity[$i]->setDeptNum($line[2]);
            $coordinatesfrcity[$i]->setDeptName($line[3]);
            $coordinatesfrcity[$i]->setDistrict($line[4]);
            $coordinatesfrcity[$i]->setPostCode($line[5]);
            $coordinatesfrcity[$i]->setCity($line[6]);

            $manager->persist($coordinatesfrcity[$i]);

            $this->addReference('coordinatesfrcity-'.$i, $coordinatesfrcity[$i]);


            $i = $i + 1;
        }

        fclose($csv);

        $manager->flush();
    }

    public function getOrder()
    {
        return 1;
    }
}

Two rules to follow when you create big batch imports like this:

  • Disable SQL logging with $manager->getConnection()->getConfiguration()->setSQLLogger(null); to avoid a huge memory cost.

  • Flush and clear frequently instead of only once at the end. I suggest you add if ($i % 25 == 0) { $manager->flush(); $manager->clear(); } inside your loop, to flush every 25 INSERTs.

EDIT: One last thing I forgot: don't keep your entities in variables once you no longer need them. Here, in your loop, you only need the current entity being processed, so don't store the previous entities in a $coordinatesfrcity array. Keeping them around can lead to a memory overflow. Putting it all together gives something like the sketch below.
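Here is a minimal sketch of the revised load() method with all three points applied, assuming the same entity and setters as in the question, and that $manager is the usual Doctrine EntityManager (the generic ObjectManager interface does not declare getConnection()). Note that $manager->clear() detaches previously persisted entities, so the addReference() calls from the original code are dropped here; references to entities from earlier batches would not survive a clear().

public function load(ObjectManager $manager)
{
    // Rule 1: disable SQL logging so Doctrine does not accumulate
    // a log entry for every executed query.
    $manager->getConnection()->getConfiguration()->setSQLLogger(null);

    $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');

    $i = 0;
    while (($line = fgetcsv($csv)) !== false) {
        // Rule 3: one local variable, not an ever-growing array.
        $city = new CoordFRCity();
        $city->setAreaPre2016($line[0]);
        $city->setAreaPost2016($line[1]);
        $city->setDeptNum($line[2]);
        $city->setDeptName($line[3]);
        $city->setDistrict($line[4]);
        $city->setPostCode($line[5]);
        $city->setCity($line[6]);

        $manager->persist($city);

        // Rule 2: flush and clear every 25 inserts.
        if (++$i % 25 === 0) {
            $manager->flush();
            $manager->clear();
        }
    }

    fclose($csv);

    // Flush the final, partial batch.
    $manager->flush();
    $manager->clear();
}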

    • Thanks. I had tried flushing on every insert, but I suppose that was far too frequent. I'll try your proposal. Sorry, but what is SQL logging?
    • Doctrine comes with a logging system that keeps a trace of every executed SQL query. In this context you won't need it, and it would just be a needless memory cost.
    • OK, I understand what SQL logging is. And you are right, that array comes from another piece of code and is not necessary here.
    • Just unbelievable. I was importing a CSV with 14 columns and 453,000 lines (200 MB). The script used to take 48 hours to import all the rows. With this optimization, it takes only 10 minutes. That's freaking awesome. Thanks a lot!

There is a great example in the Docs: http://doctrine-orm.readthedocs.org/projects/doctrine-orm/en/latest/reference/batch-processing.html

Use a modulo (x % y) expression to implement batch processing; this example will flush every 20 inserts. You may be able to optimise the batch size depending on your server.

$batchSize = 20;
for ($i = 1; $i <= 10000; ++$i) {
    $user = new CmsUser;
    $user->setStatus('user');
    $user->setUsername('user' . $i);
    $user->setName('Mr.Smith-' . $i);
    $em->persist($user);
    if (($i % $batchSize) === 0) {
        $em->flush();
        $em->clear(); // Detaches all objects from Doctrine!
    }
}
$em->flush(); // Flush the remaining objects that did not make up a full batch
$em->clear();

For fixtures which need lots of memory but don't depend on each other, I get around this problem by using the --append flag to insert one entity (or a smaller group of entities) at a time:

bin/console doctrine:fixtures:load --fixtures="memory_hungry_fixture.file" --append

Then I write a Bash script which runs that command as many times as I need, along the lines of the sketch below.
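A minimal sketch of such a wrapper, assuming the fixtures have been split into numbered files (the file names and part count here are illustrative, not from the original setup):

#!/usr/bin/env bash
# Hypothetical wrapper: load each fixture part in its own process so
# memory is released between runs. File names are illustrative.
for part in 1 2 3 4; do
    bin/console doctrine:fixtures:load \
        --fixtures="memory_hungry_fixture_${part}.file" --append
done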

In your case, you could extend the fixtures command and add a flag which loads batches of entities: the first 1,000 rows, then the second 1,000, and so on.

