CNN for 1x8x8 Matrix outputs weird values?

I'm currently trying to create a CNN with deeplearning4j that is supposed to evaluate chess games. It takes a double[][] with one channel as input and produces a single double value as output. My training input data consists of double[][] objects filled with the numbers -6.0 to 6.0 for the chess pieces (-6.0 means black king, 6.0 means white king, 0.0 is an empty square), and the output data are centipawns, usually between -1000 and 1000 (negative numbers stand for an advantage for black, positive for white). Though there are some centipawns beyond that, like -20000 if someone has a really extreme advantage, and even the maximum integer value if there's a checkmate. Now I try to train my CNN to calculate a centipawn value for a chess board, but it only outputs insanely high positive numbers like 1.0E9... Anybody know how to fix that or what I could try?
38 Replies
JavaBot
JavaBot2mo ago
This post has been reserved for your question.
Hey @Lloyd_159! Please use /close or the Close Post button above when your problem is solved. Please remember to follow the help guidelines. This post will be automatically closed after 300 minutes of inactivity.
TIP: Narrow down your issue to simple and precise questions to maximize the chance that others will reply in here.
dan1st
dan1st2mo ago
I would make all outputs between 0 and 1 or between -1 and 1. Also, how much training data do you have?
Lloyd_159
Lloyd_159OP2mo ago
technically one million games, but I'm currently just using 20,000. So then the max integer limit would be 1 and a cp of 100 would be like 0.0001 or something?
dan1st
dan1st2mo ago
yeah, something like that. You can also use the sigmoid function for that
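That squashing idea can be sketched like this. The 400 cp scale constant here is an assumption (roughly "one pawn ahead"), not something from the thread:

```java
public class CpSquash {
    // Squash a centipawn score into (-1, 1) with tanh.
    // Dividing by 400 is an arbitrary scale choice: cp = 400 maps to about 0.76,
    // and extreme values (mate sentinels, -20000, Integer.MAX_VALUE) saturate near +/-1.
    static double squash(double cp) {
        return Math.tanh(cp / 400.0);
    }

    public static void main(String[] args) {
        System.out.println(squash(0));                 // 0.0
        System.out.println(squash(100));               // somewhere between 0 and 1
        System.out.println(squash(-20000));            // -1.0 (saturated)
        System.out.println(squash(Integer.MAX_VALUE)); // 1.0 (saturated)
    }
}
```

Unlike a hard clamp, this stays monotonic everywhere, so a 500 cp advantage still encodes as strictly larger than 100 cp.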
Lloyd_159
Lloyd_159OP2mo ago
okay, I'm slightly confused. I mean, I'm already applying a normalizer for the values between -1 and 1. Is that the same thing?
dan1st
dan1st2mo ago
that means the NN is only able to predict values between -1 and 1? Or maybe it also applies the normalization to your results. Anyways, can I see the code?
Lloyd_159
Lloyd_159OP2mo ago
that's the part where I am using the normalizer in the training
for (int i = 0; i < trainingData.length; i++) {
    INDArray input = Nd4j.create(trainingData[i]);
    INDArray label = Nd4j.create(new double[]{trainingCps[i]}).reshape(1, 1);
    trainData.add(new DataSet(input.reshape(1, 1, 8, 8), label));
}
DataSetIterator iterator = new ListDataSetIterator<>(trainData, 32);

NormalizerMinMaxScaler scaler = new NormalizerMinMaxScaler(-1, 1);
scaler.fit(iterator);
iterator.setPreProcessor(scaler);
dan1st
dan1st2mo ago
that might be only for the input
Lloyd_159
Lloyd_159OP2mo ago
trainingData is a double[][][] array for the input and trainingCps is a double[] array for the output, so they both should be included there, right? Or do I have to separate them?
dan1st
dan1st2mo ago
I think the NormalizerMinMaxScaler is just for the input. Can you show the code of the model and how you train/run it?
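For reference: DL4J's dataset normalizers only touch the features by default, but if I remember right they have a fitLabel switch that makes them normalize the labels too. A rough sketch of that configuration (double-check the method names against the DL4J docs before relying on them):

```java
NormalizerMinMaxScaler scaler = new NormalizerMinMaxScaler(-1, 1);
scaler.fitLabel(true);          // also fit/transform the labels, not just the features
scaler.fit(iterator);
iterator.setPreProcessor(scaler);
// at inference time, map a prediction back to the centipawn scale:
// scaler.revertLabels(output);
```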
Lloyd_159
Lloyd_159OP2mo ago
I'll split it up into multiple messages since it's too big
public class TestCNN2 {
    private MultiLayerNetwork model;

    int inputNeurons = 1;
    int outputNeurons = 1;
    int neuronsInHiddenLayer1 = 32;
    int neuronsInHiddenLayer2 = 12;

    public TestCNN2() {
        MultiLayerConfiguration modelConf = new NeuralNetConfiguration.Builder()
                .seed(123)
                .dropOut(0.2)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .updater(new Adam(0.001))
                .list()
                .layer(0, new ConvolutionLayer.Builder(3, 3)
                        .nIn(inputNeurons)
                        .name("firstConvolutionalLayer")
                        .stride(1, 1)
                        .nOut(12)
                        .activation(Activation.RELU)
                        .build())
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .name("firstPoolingLayer")
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                .layer(2, new ConvolutionLayer.Builder(3, 3)
                        .name("secondConvolutionalLayer")
                        .stride(1, 1)
                        .nOut(24)
                        .activation(Activation.RELU)
                        .build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .name("secondPoolingLayer")
                        .kernelSize(2, 2)
                        .stride(2, 2)
                        .build())
                .layer(4, new DenseLayer.Builder()
                        .name("firstDenseLayer")
                        .nOut(neuronsInHiddenLayer1)
                        .activation(Activation.RELU)
                        .build())
                .layer(5, new DenseLayer.Builder()
                        .name("secondDenseLayer")
                        .nIn(neuronsInHiddenLayer1)
                        .nOut(neuronsInHiddenLayer2)
                        .activation(Activation.RELU)
                        .build())
                .layer(6, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nIn(neuronsInHiddenLayer2)
                        .activation(Activation.IDENTITY)
                        .nOut(outputNeurons)
                        .build())
                .setInputType(InputType.convolutional(8, 8, 1))
                .build();
        model = new MultiLayerNetwork(modelConf);
        model.init();
    }

    public void trainModel() {
        int epochs = 100;

        int dataLength = 1200;
        double[][][] trainingData = new double[dataLength][8][8];
        double[] trainingCps = new double[dataLength];
        ChessParser chessParser = new ChessParser();
        chessParser.readLines();

        for (int i = 0; i < trainingData.length; i++) {
            for (int j = 0; j < trainingData[i].length; j++) {
                for (int k = 0; k < trainingData[i][j].length; k++) {
                    trainingData[i][j][k] = chessParser.getGames().get(i).formatBoardAsMatrix()[j][k];
                    System.out.println(chessParser.getGames().get(i).formatBoardAsMatrix()[j][k]);
                }
            }
            trainingCps[i] = chessParser.getGames().get(i).getEvalCp();
            System.out.println("CP: " + trainingCps[i]);
            System.out.println();
            System.out.println();
        }
        List<DataSet> trainData = new ArrayList<>();
        for (int i = 0; i < trainingData.length; i++) {
            INDArray input = Nd4j.create(trainingData[i]);
            INDArray label = Nd4j.create(new double[]{trainingCps[i]}).reshape(1, 1);
            trainData.add(new DataSet(input.reshape(1, 1, 8, 8), label));
        }
        DataSetIterator iterator = new ListDataSetIterator<>(trainData, 32);

        NormalizerMinMaxScaler scaler = new NormalizerMinMaxScaler(-1, 1);
        scaler.fit(iterator);
        iterator.setPreProcessor(scaler);

        model.setListeners(new ScoreIterationListener(10));

        for (int i = 0; i < epochs; i++) {
            model.fit(iterator);
            System.out.println("Epoch " + i);
        }
    }

    public double analyseGame(ChessGame pChessGame) {
        double[][] chessBoard = pChessGame.formatBoardAsMatrix();
        INDArray input = Nd4j.create(chessBoard).reshape(1, 1, 8, 8);
        INDArray output = model.output(input);
        return (output.getDouble(0));
    }
    public void testModel() {
        String[][] testingBoardOne = {
                {"r", "n", "b", "q", "k", "b", "n", "r"},
                {"p", "p", "p", "p", "p", "p", "p", "p"},
                {null, null, null, null, null, null, null, null},
                {null, null, null, null, null, null, null, null},
                {null, null, null, null, null, null, null, null},
                {null, null, null, null, null, null, null, null},
                {"P", "P", "P", "P", "P", "P", "P", "P"},
                {"R", "N", "B", "Q", "K", "B", "N", "R"},
        };
        ChessGame testingGameOne = new ChessGame(testingBoardOne, 'w', "-", "-", "-", 0);
        System.out.println("Computed test result 1 (ideal value " + testingGameOne.getEvalCp() + "): Centipawn="
                + analyseGame(testingGameOne));
        String[][] testingBoardTwo = {
                {"r", "n", "q", null, "k", "b", "r", null},
                {null, "p", null, null, "p", "p", null, "p"},
                {null, null, null, "p", "b", null, null, null},
                {"p", null, "p", null, null, null, null, null},
                {"p", null, null, null, null, "n", "P", null},
                {null, "P", null, null, null, null, null, null},
                {null, null, "P", "P", "P", "P", null, null},
                {null, null, null, null, "K", "B", null, "R"},
        };
        ChessGame testingGameTwo = new ChessGame(testingBoardTwo, 'w', "-", "-", "-", -400);
        System.out.println("Computed test result 2 (ideal value " + testingGameTwo.getEvalCp() + "): Centipawn="
                + analyseGame(testingGameTwo));
        String[][] testingBoardThree = {
                {"r", null, null, null, "k", "b", null, null},
                {"p", null, null, "q", "p", "p", "p", null},
                {null, "p", null, null, null, null, null, null},
                {null, "Q", "p", null, null, null, null, "p"},
                {null, null, "P", null, null, null, null, null},
                {null, null, "N", "P", null, null, null, null},
                {"P", null, null, null, "P", "P", "P", "P"},
                {"R", "N", "B", null, "K", "B", null, "R"},
        };
        ChessGame testingGameThree = new ChessGame(testingBoardThree, 'w', "-", "-", "-", 400);
        System.out.println("Computed test result 3 (ideal value " + testingGameThree.getEvalCp() + "): Centipawn="
                + analyseGame(testingGameThree));
    }

    public static void main(String[] args) {
        TestCNN2 testCNN = new TestCNN2();
        testCNN.trainModel();
        testCNN.testModel();
    }
}
that's it. It might have been easier to upload it to pastebin or smth, when I think about it
Lloyd_159
Lloyd_159OP2mo ago
Pastebin
import org.deeplearning4j.datasets.iterator.utilty.ListDataSetItera...
dan1st
dan1st2mo ago
You are scaling the training data but not in analyseGame
Lloyd_159
Lloyd_159OP2mo ago
oooh I see
dan1st
dan1st2mo ago
Make sure you are using the same NormalizerMinMaxScaler
Lloyd_159
Lloyd_159OP2mo ago
I think I am now. The program is running to give me the next output; it might take a while tho. I updated the analyseGame() method to this:
public double analyseGame(ChessGame pChessGame) {
    double[][] chessBoard = pChessGame.formatBoardAsMatrix();
    INDArray input = Nd4j.create(chessBoard).reshape(1, 1, 8, 8);
    scaler.transform(input);
    INDArray output = model.output(input);
    return (output.getDouble(0));
}
and moved NormalizerMinMaxScaler scaler = new NormalizerMinMaxScaler(-1, 1); up to the class attributes, and just fit and apply it during the training (scaler.fit(iterator);), so it should be the same, imo. Tho, the outputs are still the same: 3.6757416E7, 3.8842176E7, 5.0122312E7
dan1st
dan1st2mo ago
Are you saving the model between training and inference? These outputs seem like it's trying to get the maximum
Lloyd_159
Lloyd_159OP2mo ago
you mean like private MultiLayerNetwork model at the top? yes
dan1st
dan1st2mo ago
ok, that shouldn't be the problem. How many epochs are you using?
Lloyd_159
Lloyd_159OP2mo ago
100 currently
dan1st
dan1st2mo ago
what if you use something that's already in the training data and not a checkmate for testing?
Lloyd_159
Lloyd_159OP2mo ago
btw, is it normal that it takes 5 seconds for each epoch during the training? I mean, that's almost 10 minutes each time I run the program. The first chess board of the training data is trained with an output of 58.0, but it also outputs 5.111752E7 when testing
dan1st
dan1st2mo ago
yes. Can you try running tanh on the output and check the result?
Lloyd_159
Lloyd_159OP2mo ago
well, something changed. Now every result is 0.99999-something: 0.9999973177909851, 0.9999986290931702, 0.9999977946281433 and 0.9999997615814209. No clue if this is progress or not; it's supposed to be 0, -400, 400 and 58
dan1st
dan1st2mo ago
ok, so it didn't really help much; the only thing tanh is doing is making sure a number is between -1 and 1. In your training data, how many positions do you have that are infinity? How many are negative infinity? And how many are some other value? Maybe try not doing anything with infinity, but use specific values for the output during training, e.g. change the target values so that everything is between -1 and 1
Lloyd_159
Lloyd_159OP2mo ago
uhm, no clue how I should do this, honestly. It's pretty late for me; I might try to do this tomorrow
dan1st
dan1st2mo ago
When you fill your trainingCps, make sure all values are between -1 and 1 (or between 0 and 1 if that works better for you). For example, you could make it such that 1 is a checkmate for player 1, -1 is a checkmate for player 2, 0 is neutral, and everything in between is an advantage for one or the other player. I'm in the same timezone as you xd
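That target encoding could look roughly like this. The 1000 cp cap is an assumed cutoff, not a value from this thread:

```java
public class TargetEncode {
    // Encode centipawns into [-1, 1]: +1 = checkmate/crushing for white,
    // -1 = checkmate/crushing for black, 0 = equal position.
    // Ordinary evals are scaled linearly; anything beyond +/-1000 cp
    // (including mate sentinels like Integer.MAX_VALUE) is clamped.
    static double encode(double cp) {
        double t = cp / 1000.0;
        return Math.max(-1.0, Math.min(1.0, t));
    }

    public static void main(String[] args) {
        System.out.println(encode(0));                 // 0.0
        System.out.println(encode(58));                // 0.058
        System.out.println(encode(-400));              // -0.4
        System.out.println(encode(Integer.MAX_VALUE)); // 1.0
    }
}
```

Applied once over the whole trainingCps array before building the DataSets, this removes the need to min-max scale the labels at all.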
JavaBot
JavaBot2mo ago
💤 Post marked as dormant
This post has been inactive for over 300 minutes, thus, it has been archived. If your question was not answered yet, feel free to re-open this post or create a new one. In case your post is not getting any attention, you can try to use /help ping. Warning: abusing this will result in moderative actions taken against you.
Lloyd_159
Lloyd_159OP2mo ago
isn't that exactly what the NormalizerMinMaxScaler(-1, 1) already does? btw, I tried to run the network with just one training sample for 1000 epochs and tested the same matrix as input afterwards. The expected result is 58.0, but I got 14.09. Then I changed the seed and got 125.43, so ig the network currently doesn't train at all?
dan1st
dan1st2mo ago
on the input, I think. Well, you cannot really expect to get the exact right result with a NN
Lloyd_159
Lloyd_159OP2mo ago
so I basically should try to separate the scaler between input and output? Well, but I mean it should at least be pretty close to the right result
dan1st
dan1st2mo ago
oh, also, I think that a min-max scaler is not a good idea if you use infinity as a possible value
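Concretely, a single mate score dominates the min-max range, so every normal evaluation collapses onto essentially the same point (toy numbers, assumed):

```java
public class MinMaxOutlier {
    // Plain min-max scaling of x from [min, max] into [-1, 1].
    static double scale(double x, double min, double max) {
        return 2 * (x - min) / (max - min) - 1;
    }

    public static void main(String[] args) {
        // One checkmate sentinel in the data stretches the range to ~2.1 billion:
        double min = -1000, max = Integer.MAX_VALUE;
        // A -400 cp and a +400 cp position become nearly indistinguishable,
        // both landing just below -1:
        System.out.println(scale(-400, min, max));
        System.out.println(scale(400, min, max));
    }
}
```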
Lloyd_159
Lloyd_159OP2mo ago
like, if I train it with two matrices which are the exact opposite of each other, with outputs of 100 and -100, and then test the network with one of those two matrices, the NN should at least get the positive/negative sign of the output right. ig with infinity you mean up to max integer value, right?
dan1st
dan1st2mo ago
yes
Lloyd_159
Lloyd_159OP2mo ago
so which one can I use then?
dan1st
dan1st2mo ago
try consistently using values between -1 and 1 that make sense for you. NNs don't really like infinity/max int. It would be possible to do something for that with tanh/sigmoid or similar
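If the targets were squashed with tanh during training, the network's (-1, 1) prediction can be mapped back to centipawns with the inverse (atanh). Java's Math class has no atanh, so here is a sketch; the 400 scale is an assumption and must match whatever constant was used when encoding the targets:

```java
public class CpDecode {
    // atanh(y) = 0.5 * ln((1 + y) / (1 - y)), defined for -1 < y < 1.
    static double atanh(double y) {
        return 0.5 * Math.log((1 + y) / (1 - y));
    }

    // Inverse of cp -> tanh(cp / 400): recover centipawns from a prediction.
    static double toCentipawns(double y) {
        return 400.0 * atanh(y);
    }

    public static void main(String[] args) {
        // Round-tripping a 58 cp evaluation through encode/decode:
        double y = Math.tanh(58.0 / 400.0);
        System.out.println(toCentipawns(y)); // ~58.0
    }
}
```

Note that predictions saturating at exactly ±1 (as with mate positions) decode to ±infinity, which is another reason to clamp mate sentinels before training.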
Lloyd_159
Lloyd_159OP2mo ago
okay thanks, I'll have a look into that then
JavaBot
JavaBot2mo ago
💤 Post marked as dormant