| end of split 1 / 18 | epoch 1 | time: 2463.04s | valid loss 3.72 | valid ppl 41.29 | learning rate 20.0000 | end of split 2 / 18 | epoch 1 | time: 2473.78s | valid loss 2.78 | valid ppl 16.15 | learning rate 20.0000 | end of split 3 / 18 | epoch 1 | time: 2477.78s | valid loss 2.00 | valid ppl 7.40 | learning rate 20.0000 | end of split 4 / 18 | epoch 1 | time: 2473.17s | valid loss 1.85 | valid ppl 6.37 | learning rate 20.0000 | end of split 5 / 18 | epoch 1 | time: 2483.12s | valid loss 1.77 | valid ppl 5.85 | learning rate 20.0000 | end of split 6 / 18 | epoch 1 | time: 2472.16s | valid loss 1.72 | valid ppl 5.61 | learning rate 20.0000 | end of split 7 / 18 | epoch 1 | time: 2474.96s | valid loss 1.67 | valid ppl 5.33 | learning rate 20.0000 | end of split 8 / 18 | epoch 1 | time: 2479.35s | valid loss 1.64 | valid ppl 5.18 | learning rate 20.0000 | end of split 9 / 18 | epoch 1 | time: 2473.92s | valid loss 1.62 | valid ppl 5.08 | learning rate 20.0000 | end of split 10 / 18 | epoch 1 | time: 2473.96s | valid loss 1.61 | valid ppl 5.00 | learning rate 20.0000 | end of split 11 / 18 | epoch 1 | time: 2478.82s | valid loss 1.59 | valid ppl 4.88 | learning rate 20.0000 | end of split 12 / 18 | epoch 1 | time: 2478.80s | valid loss 1.57 | valid ppl 4.82 | learning rate 20.0000 | end of split 13 / 18 | epoch 1 | time: 2477.06s | valid loss 1.56 | valid ppl 4.75 | learning rate 20.0000 | end of split 14 / 18 | epoch 1 | time: 2472.12s | valid loss 1.55 | valid ppl 4.73 | learning rate 20.0000 | end of split 15 / 18 | epoch 1 | time: 2480.09s | valid loss 1.55 | valid ppl 4.73 | learning rate 20.0000 | end of split 16 / 18 | epoch 1 | time: 2479.83s | valid loss 1.53 | valid ppl 4.61 | learning rate 20.0000 | end of split 17 / 18 | epoch 1 | time: 2480.84s | valid loss 1.53 | valid ppl 4.60 | learning rate 20.0000 | end of split 18 / 18 | epoch 1 | time: 2483.16s | valid loss 1.51 | valid ppl 4.53 | learning rate 20.0000 | end of split 1 / 18 | epoch 2 | time: 
2475.07s | valid loss 1.51 | valid ppl 4.52 | learning rate 20.0000 | end of split 2 / 18 | epoch 2 | time: 2477.07s | valid loss 1.50 | valid ppl 4.48 | learning rate 20.0000 | end of split 3 / 18 | epoch 2 | time: 2477.53s | valid loss 1.49 | valid ppl 4.44 | learning rate 20.0000 | end of split 4 / 18 | epoch 2 | time: 2475.24s | valid loss 1.49 | valid ppl 4.42 | learning rate 20.0000 | end of split 5 / 18 | epoch 2 | time: 2474.42s | valid loss 1.48 | valid ppl 4.39 | learning rate 20.0000 | end of split 6 / 18 | epoch 2 | time: 2487.55s | valid loss 1.48 | valid ppl 4.38 | learning rate 20.0000 | end of split 7 / 18 | epoch 2 | time: 2481.33s | valid loss 1.47 | valid ppl 4.35 | learning rate 20.0000 | end of split 8 / 18 | epoch 2 | time: 2485.77s | valid loss 1.47 | valid ppl 4.34 | learning rate 20.0000 | end of split 9 / 18 | epoch 2 | time: 2470.32s | valid loss 1.47 | valid ppl 4.34 | learning rate 20.0000 | end of split 10 / 18 | epoch 2 | time: 2478.09s | valid loss 1.46 | valid ppl 4.29 | learning rate 20.0000 | end of split 11 / 18 | epoch 2 | time: 2482.80s | valid loss 1.46 | valid ppl 4.31 | learning rate 20.0000 | end of split 12 / 18 | epoch 2 | time: 2481.42s | valid loss 1.45 | valid ppl 4.27 | learning rate 20.0000 | end of split 13 / 18 | epoch 2 | time: 2475.03s | valid loss 1.45 | valid ppl 4.26 | learning rate 20.0000 | end of split 14 / 18 | epoch 2 | time: 2482.40s | valid loss 1.45 | valid ppl 4.27 | learning rate 20.0000 | end of split 15 / 18 | epoch 2 | time: 2477.23s | valid loss 1.44 | valid ppl 4.23 | learning rate 20.0000 | end of split 16 / 18 | epoch 2 | time: 2476.85s | valid loss 1.44 | valid ppl 4.23 | learning rate 20.0000 | end of split 17 / 18 | epoch 2 | time: 2479.52s | valid loss 1.43 | valid ppl 4.20 | learning rate 20.0000 | end of split 18 / 18 | epoch 2 | time: 2479.58s | valid loss 1.43 | valid ppl 4.19 | learning rate 20.0000 | end of split 1 / 18 | epoch 3 | time: 2479.08s | valid loss 1.43 | valid ppl 4.17 | 
learning rate 20.0000 | end of split 2 / 18 | epoch 3 | time: 2475.10s | valid loss 1.43 | valid ppl 4.18 | learning rate 20.0000 | end of split 3 / 18 | epoch 3 | time: 2474.14s | valid loss 1.42 | valid ppl 4.16 | learning rate 20.0000 | end of split 4 / 18 | epoch 3 | time: 2487.01s | valid loss 1.42 | valid ppl 4.15 | learning rate 20.0000 | end of split 5 / 18 | epoch 3 | time: 2478.80s | valid loss 1.42 | valid ppl 4.15 | learning rate 20.0000 | end of split 6 / 18 | epoch 3 | time: 2484.73s | valid loss 1.42 | valid ppl 4.15 | learning rate 20.0000 | end of split 7 / 18 | epoch 3 | time: 2477.17s | valid loss 1.42 | valid ppl 4.14 | learning rate 20.0000 | end of split 8 / 18 | epoch 3 | time: 2486.03s | valid loss 1.41 | valid ppl 4.11 | learning rate 20.0000 | end of split 9 / 18 | epoch 3 | time: 2477.59s | valid loss 1.41 | valid ppl 4.11 | learning rate 20.0000 | end of split 10 / 18 | epoch 3 | time: 2481.07s | valid loss 1.41 | valid ppl 4.10 | learning rate 20.0000 | end of split 11 / 18 | epoch 3 | time: 2477.34s | valid loss 1.41 | valid ppl 4.11 | learning rate 20.0000 | end of split 12 / 18 | epoch 3 | time: 2483.40s | valid loss 1.41 | valid ppl 4.08 | learning rate 20.0000 | end of split 13 / 18 | epoch 3 | time: 2477.94s | valid loss 1.41 | valid ppl 4.10 | learning rate 20.0000 | end of split 14 / 18 | epoch 3 | time: 2479.40s | valid loss 1.40 | valid ppl 4.07 | learning rate 20.0000 | end of split 15 / 18 | epoch 3 | time: 2478.58s | valid loss 1.40 | valid ppl 4.06 | learning rate 20.0000 | end of split 16 / 18 | epoch 3 | time: 2479.81s | valid loss 1.40 | valid ppl 4.05 | learning rate 20.0000 | end of split 17 / 18 | epoch 3 | time: 2479.38s | valid loss 1.40 | valid ppl 4.05 | learning rate 20.0000 | end of split 18 / 18 | epoch 3 | time: 2476.99s | valid loss 1.40 | valid ppl 4.05 | learning rate 20.0000 | end of split 1 / 18 | epoch 4 | time: 2472.35s | valid loss 1.40 | valid ppl 4.04 | learning rate 20.0000 | end of split 2 / 18 | 
epoch 4 | time: 2474.84s | valid loss 1.40 | valid ppl 4.04 | learning rate 20.0000 | end of split 3 / 18 | epoch 4 | time: 2482.54s | valid loss 1.39 | valid ppl 4.03 | learning rate 20.0000 | end of split 4 / 18 | epoch 4 | time: 2477.84s | valid loss 1.40 | valid ppl 4.04 | learning rate 20.0000 | end of split 5 / 18 | epoch 4 | time: 2477.45s | valid loss 1.39 | valid ppl 4.03 | learning rate 20.0000 | end of split 6 / 18 | epoch 4 | time: 2476.53s | valid loss 1.39 | valid ppl 4.02 | learning rate 20.0000 | end of split 7 / 18 | epoch 4 | time: 2476.24s | valid loss 1.39 | valid ppl 4.01 | learning rate 20.0000 | end of split 8 / 18 | epoch 4 | time: 2473.60s | valid loss 1.39 | valid ppl 4.01 | learning rate 20.0000 | end of split 9 / 18 | epoch 4 | time: 2472.53s | valid loss 1.39 | valid ppl 4.00 | learning rate 20.0000 | end of split 10 / 18 | epoch 4 | time: 2478.04s | valid loss 1.38 | valid ppl 3.99 | learning rate 20.0000 | end of split 11 / 18 | epoch 4 | time: 2469.73s | valid loss 1.38 | valid ppl 3.99 | learning rate 20.0000 | end of split 12 / 18 | epoch 4 | time: 2475.83s | valid loss 1.38 | valid ppl 3.98 | learning rate 20.0000 | end of split 13 / 18 | epoch 4 | time: 2482.93s | valid loss 1.38 | valid ppl 3.98 | learning rate 20.0000 | end of split 14 / 18 | epoch 4 | time: 2480.32s | valid loss 1.38 | valid ppl 3.98 | learning rate 20.0000 | end of split 15 / 18 | epoch 4 | time: 2481.41s | valid loss 1.38 | valid ppl 3.97 | learning rate 20.0000 | end of split 16 / 18 | epoch 4 | time: 2479.65s | valid loss 1.38 | valid ppl 3.96 | learning rate 20.0000 | end of split 17 / 18 | epoch 4 | time: 2483.55s | valid loss 1.38 | valid ppl 3.96 | learning rate 20.0000 | end of split 18 / 18 | epoch 4 | time: 2482.09s | valid loss 1.38 | valid ppl 3.97 | learning rate 20.0000 | end of split 1 / 18 | epoch 5 | time: 2480.08s | valid loss 1.37 | valid ppl 3.95 | learning rate 20.0000 | end of split 2 / 18 | epoch 5 | time: 2483.48s | valid loss 1.38 | 
valid ppl 3.96 | learning rate 20.0000 | end of split 3 / 18 | epoch 5 | time: 2475.94s | valid loss 1.37 | valid ppl 3.95 | learning rate 20.0000 | end of split 4 / 18 | epoch 5 | time: 2487.30s | valid loss 1.37 | valid ppl 3.95 | learning rate 20.0000 | end of split 5 / 18 | epoch 5 | time: 2484.14s | valid loss 1.37 | valid ppl 3.94 | learning rate 20.0000 | end of split 6 / 18 | epoch 5 | time: 2480.92s | valid loss 1.37 | valid ppl 3.93 | learning rate 20.0000 | end of split 7 / 18 | epoch 5 | time: 2480.03s | valid loss 1.37 | valid ppl 3.93 | learning rate 20.0000 | end of split 8 / 18 | epoch 5 | time: 2474.85s | valid loss 1.37 | valid ppl 3.94 | learning rate 20.0000 | end of split 9 / 18 | epoch 5 | time: 2488.15s | valid loss 1.37 | valid ppl 3.93 | learning rate 20.0000 | end of split 10 / 18 | epoch 5 | time: 2480.64s | valid loss 1.37 | valid ppl 3.92 | learning rate 20.0000 | end of split 11 / 18 | epoch 5 | time: 2474.80s | valid loss 1.37 | valid ppl 3.93 | learning rate 20.0000 | end of split 12 / 18 | epoch 5 | time: 2477.97s | valid loss 1.37 | valid ppl 3.92 | learning rate 20.0000 | end of split 13 / 18 | epoch 5 | time: 2477.98s | valid loss 1.36 | valid ppl 3.91 | learning rate 20.0000 | end of split 14 / 18 | epoch 5 | time: 2474.99s | valid loss 1.36 | valid ppl 3.91 | learning rate 20.0000 | end of split 15 / 18 | epoch 5 | time: 2478.68s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000 | end of split 16 / 18 | epoch 5 | time: 2475.34s | valid loss 1.36 | valid ppl 3.91 | learning rate 20.0000 | end of split 17 / 18 | epoch 5 | time: 2487.33s | valid loss 1.37 | valid ppl 3.92 | learning rate 20.0000 | end of split 18 / 18 | epoch 5 | time: 2483.16s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000 | end of split 1 / 18 | epoch 6 | time: 2482.21s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000 | end of split 2 / 18 | epoch 6 | time: 2489.03s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000 | end 
of split 3 / 18 | epoch 6 | time: 2493.54s | valid loss 1.36 | valid ppl 3.90 | learning rate 20.0000 | end of split 4 / 18 | epoch 6 | time: 2489.15s | valid loss 1.36 | valid ppl 3.89 | learning rate 20.0000 | end of split 5 / 18 | epoch 6 | time: 2500.92s | valid loss 1.36 | valid ppl 3.88 | learning rate 20.0000 | end of split 6 / 18 | epoch 6 | time: 2500.51s | valid loss 1.36 | valid ppl 3.88 | learning rate 20.0000 | end of split 7 / 18 | epoch 6 | time: 2489.40s | valid loss 1.36 | valid ppl 3.88 | learning rate 20.0000 | end of split 8 / 18 | epoch 6 | time: 2490.15s | valid loss 1.36 | valid ppl 3.88 | learning rate 20.0000 | end of split 9 / 18 | epoch 6 | time: 2483.74s | valid loss 1.36 | valid ppl 3.88 | learning rate 20.0000 | end of split 10 / 18 | epoch 6 | time: 2486.09s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000 | end of split 11 / 18 | epoch 6 | time: 2482.55s | valid loss 1.35 | valid ppl 3.88 | learning rate 20.0000 | end of split 12 / 18 | epoch 6 | time: 2480.15s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000 | end of split 13 / 18 | epoch 6 | time: 2489.01s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000 | end of split 14 / 18 | epoch 6 | time: 2478.79s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000 | end of split 15 / 18 | epoch 6 | time: 2488.40s | valid loss 1.35 | valid ppl 3.86 | learning rate 20.0000 | end of split 16 / 18 | epoch 6 | time: 2478.96s | valid loss 1.35 | valid ppl 3.86 | learning rate 20.0000 | end of split 17 / 18 | epoch 6 | time: 2489.03s | valid loss 1.35 | valid ppl 3.87 | learning rate 20.0000 | end of split 18 / 18 | epoch 6 | time: 2490.31s | valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 1 / 18 | epoch 7 | time: 2481.27s | valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 2 / 18 | epoch 7 | time: 2480.16s | valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 3 / 18 | epoch 7 | time: 2494.82s | 
valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 4 / 18 | epoch 7 | time: 2487.06s | valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 5 / 18 | epoch 7 | time: 2489.29s | valid loss 1.35 | valid ppl 3.86 | learning rate 20.0000 | end of split 6 / 18 | epoch 7 | time: 2484.94s | valid loss 1.35 | valid ppl 3.85 | learning rate 20.0000 | end of split 7 / 18 | epoch 7 | time: 2489.74s | valid loss 1.35 | valid ppl 3.84 | learning rate 20.0000 | end of split 8 / 18 | epoch 7 | time: 2489.68s | valid loss 1.35 | valid ppl 3.84 | learning rate 20.0000 | end of split 9 / 18 | epoch 7 | time: 2484.14s | valid loss 1.35 | valid ppl 3.84 | learning rate 20.0000 | end of split 10 / 18 | epoch 7 | time: 2482.77s | valid loss 1.34 | valid ppl 3.84 | learning rate 20.0000 | end of split 11 / 18 | epoch 7 | time: 2491.34s | valid loss 1.34 | valid ppl 3.84 | learning rate 20.0000 | end of split 12 / 18 | epoch 7 | time: 2483.82s | valid loss 1.35 | valid ppl 3.84 | learning rate 20.0000 | end of split 13 / 18 | epoch 7 | time: 2484.59s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 14 / 18 | epoch 7 | time: 2485.52s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 15 / 18 | epoch 7 | time: 2485.12s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000 | end of split 16 / 18 | epoch 7 | time: 2482.05s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 17 / 18 | epoch 7 | time: 2487.01s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000 | end of split 18 / 18 | epoch 7 | time: 2477.03s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 1 / 18 | epoch 8 | time: 2484.28s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000 | end of split 2 / 18 | epoch 8 | time: 2481.68s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 3 / 18 | epoch 8 | time: 2487.26s | valid loss 1.34 | valid ppl 3.82 | learning 
rate 20.0000 | end of split 4 / 18 | epoch 8 | time: 2483.60s | valid loss 1.34 | valid ppl 3.83 | learning rate 20.0000 | end of split 5 / 18 | epoch 8 | time: 2487.25s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000 | end of split 6 / 18 | epoch 8 | time: 2486.95s | valid loss 1.34 | valid ppl 3.82 | learning rate 20.0000 | end of split 7 / 18 | epoch 8 | time: 2483.94s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 8 / 18 | epoch 8 | time: 2491.45s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 9 / 18 | epoch 8 | time: 2491.08s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 10 / 18 | epoch 8 | time: 2494.76s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 11 / 18 | epoch 8 | time: 2490.90s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 12 / 18 | epoch 8 | time: 2495.73s | valid loss 1.34 | valid ppl 3.81 | learning rate 20.0000 | end of split 13 / 18 | epoch 8 | time: 2488.95s | valid loss 1.34 | valid ppl 3.80 | learning rate 20.0000 | end of split 14 / 18 | epoch 8 | time: 2499.30s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 15 / 18 | epoch 8 | time: 2493.14s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 16 / 18 | epoch 8 | time: 2486.92s | valid loss 1.34 | valid ppl 3.80 | learning rate 20.0000 | end of split 17 / 18 | epoch 8 | time: 2487.28s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000 | end of split 18 / 18 | epoch 8 | time: 2486.85s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000 | end of split 1 / 18 | epoch 9 | time: 2488.58s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 2 / 18 | epoch 9 | time: 2479.71s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 3 / 18 | epoch 9 | time: 2492.13s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000 | end of split 4 / 18 | epoch 9 | 
time: 2488.77s | valid loss 1.34 | valid ppl 3.80 | learning rate 20.0000 | end of split 5 / 18 | epoch 9 | time: 2486.55s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 6 / 18 | epoch 9 | time: 2482.04s | valid loss 1.33 | valid ppl 3.80 | learning rate 20.0000 | end of split 7 / 18 | epoch 9 | time: 2492.95s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000 | end of split 8 / 18 | epoch 9 | time: 2571.43s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 9 / 18 | epoch 9 | time: 2492.71s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 10 / 18 | epoch 9 | time: 2486.05s | valid loss 1.33 | valid ppl 3.79 | learning rate 20.0000 | end of split 11 / 18 | epoch 9 | time: 2491.21s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 12 / 18 | epoch 9 | time: 2683.20s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 13 / 18 | epoch 9 | time: 3476.32s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 14 / 18 | epoch 9 | time: 3448.30s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 15 / 18 | epoch 9 | time: 3523.94s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 16 / 18 | epoch 9 | time: 3519.58s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 17 / 18 | epoch 9 | time: 3512.81s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 18 / 18 | epoch 9 | time: 3503.34s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 1 / 18 | epoch 10 | time: 3445.22s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 2 / 18 | epoch 10 | time: 3433.10s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 3 / 18 | epoch 10 | time: 3444.95s | valid loss 1.33 | valid ppl 3.78 | learning rate 20.0000 | end of split 4 / 18 | epoch 10 | time: 3542.90s | valid loss 1.33 | valid 
ppl 3.77 | learning rate 20.0000 | end of split 5 / 18 | epoch 10 | time: 3561.31s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 6 / 18 | epoch 10 | time: 3594.32s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 7 / 18 | epoch 10 | time: 3492.49s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 8 / 18 | epoch 10 | time: 3476.63s | valid loss 1.33 | valid ppl 3.76 | learning rate 20.0000 | end of split 9 / 18 | epoch 10 | time: 3443.35s | valid loss 1.33 | valid ppl 3.76 | learning rate 20.0000 | end of split 10 / 18 | epoch 10 | time: 3453.98s | valid loss 1.33 | valid ppl 3.76 | learning rate 20.0000 | end of split 11 / 18 | epoch 10 | time: 3464.32s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000 | end of split 12 / 18 | epoch 10 | time: 3422.09s | valid loss 1.33 | valid ppl 3.77 | learning rate 20.0000 | end of split 13 / 18 | epoch 10 | time: 3452.29s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000 | end of split 14 / 18 | epoch 10 | time: 3452.07s | valid loss 1.33 | valid ppl 3.76 | learning rate 20.0000 | end of split 15 / 18 | epoch 10 | time: 3447.27s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000 | end of split 16 / 18 | epoch 10 | time: 3488.27s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 17 / 18 | epoch 10 | time: 3466.22s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 18 / 18 | epoch 10 | time: 3436.34s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 1 / 18 | epoch 11 | time: 3465.54s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000 | end of split 2 / 18 | epoch 11 | time: 3463.45s | valid loss 1.32 | valid ppl 3.76 | learning rate 20.0000 | end of split 3 / 18 | epoch 11 | time: 3505.13s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 4 / 18 | epoch 11 | time: 3449.57s | valid loss 1.32 | valid ppl 3.76 | learning rate 
20.0000 | end of split 5 / 18 | epoch 11 | time: 3485.90s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 6 / 18 | epoch 11 | time: 3611.54s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 7 / 18 | epoch 11 | time: 3522.56s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 8 / 18 | epoch 11 | time: 3557.26s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 9 / 18 | epoch 11 | time: 3491.95s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 10 / 18 | epoch 11 | time: 3494.98s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 11 / 18 | epoch 11 | time: 3481.46s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 12 / 18 | epoch 11 | time: 3501.36s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 13 / 18 | epoch 11 | time: 3495.56s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 14 / 18 | epoch 11 | time: 3490.59s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 15 / 18 | epoch 11 | time: 3547.61s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 16 / 18 | epoch 11 | time: 3564.68s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 17 / 18 | epoch 11 | time: 3604.26s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 18 / 18 | epoch 11 | time: 3613.34s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 1 / 18 | epoch 12 | time: 3620.89s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 2 / 18 | epoch 12 | time: 3609.81s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 3 / 18 | epoch 12 | time: 3616.07s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 4 / 18 | epoch 12 | time: 3625.23s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 5 / 
18 | epoch 12 | time: 3602.91s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 6 / 18 | epoch 12 | time: 3620.52s | valid loss 1.32 | valid ppl 3.75 | learning rate 20.0000 | end of split 7 / 18 | epoch 12 | time: 3377.07s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 8 / 18 | epoch 12 | time: 3480.30s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 9 / 18 | epoch 12 | time: 3516.73s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 10 / 18 | epoch 12 | time: 3477.86s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 11 / 18 | epoch 12 | time: 3480.07s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 12 / 18 | epoch 12 | time: 2498.73s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 13 / 18 | epoch 12 | time: 2697.01s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 14 / 18 | epoch 12 | time: 2931.38s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 15 / 18 | epoch 12 | time: 2956.95s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 16 / 18 | epoch 12 | time: 2965.55s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 17 / 18 | epoch 12 | time: 2929.03s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 18 / 18 | epoch 12 | time: 2985.95s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 1 / 18 | epoch 13 | time: 2993.72s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 2 / 18 | epoch 13 | time: 2991.02s | valid loss 1.32 | valid ppl 3.74 | learning rate 20.0000 | end of split 3 / 18 | epoch 13 | time: 2995.79s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 4 / 18 | epoch 13 | time: 3005.45s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 5 / 18 | epoch 13 | time: 
3015.18s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 6 / 18 | epoch 13 | time: 2945.22s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 7 / 18 | epoch 13 | time: 3745.65s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 8 / 18 | epoch 13 | time: 3742.99s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 9 / 18 | epoch 13 | time: 3740.68s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 10 / 18 | epoch 13 | time: 3739.22s | valid loss 1.32 | valid ppl 3.73 | learning rate 20.0000 | end of split 11 / 18 | epoch 13 | time: 3750.67s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 12 / 18 | epoch 13 | time: 3730.19s | valid loss 1.31 | valid ppl 3.72 | learning rate 20.0000 | end of split 13 / 18 | epoch 13 | time: 3734.69s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 14 / 18 | epoch 13 | time: 3735.51s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 15 / 18 | epoch 13 | time: 3723.33s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 16 / 18 | epoch 13 | time: 3739.89s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 17 / 18 | epoch 13 | time: 3725.18s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 18 / 18 | epoch 13 | time: 3732.94s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 1 / 18 | epoch 14 | time: 3724.66s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 2 / 18 | epoch 14 | time: 3726.38s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 3 / 18 | epoch 14 | time: 3721.42s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 4 / 18 | epoch 14 | time: 3736.99s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 5 / 18 | epoch 14 | time: 3607.70s | valid loss 1.31 | 
valid ppl 3.71 | learning rate 20.0000 | end of split 6 / 18 | epoch 14 | time: 3635.17s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 7 / 18 | epoch 14 | time: 3623.78s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 8 / 18 | epoch 14 | time: 3625.29s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 9 / 18 | epoch 14 | time: 3617.10s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 10 / 18 | epoch 14 | time: 3611.98s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 11 / 18 | epoch 14 | time: 3626.40s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 12 / 18 | epoch 14 | time: 3622.20s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 13 / 18 | epoch 14 | time: 3618.56s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 14 / 18 | epoch 14 | time: 3642.49s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 15 / 18 | epoch 14 | time: 3620.78s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 16 / 18 | epoch 14 | time: 3628.28s | valid loss 1.31 | valid ppl 3.71 | learning rate 20.0000 | end of split 17 / 18 | epoch 14 | time: 3610.75s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 18 / 18 | epoch 14 | time: 3603.11s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 1 / 18 | epoch 15 | time: 3178.00s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 2 / 18 | epoch 15 | time: 3126.23s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 3 / 18 | epoch 15 | time: 3377.80s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 4 / 18 | epoch 15 | time: 2485.93s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 5 / 18 | epoch 15 | time: 3333.79s | valid loss 1.31 | valid ppl 3.70 | learning 
rate 20.0000 | end of split 6 / 18 | epoch 15 | time: 3532.75s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 7 / 18 | epoch 15 | time: 2776.07s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 8 / 18 | epoch 15 | time: 2532.20s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 9 / 18 | epoch 15 | time: 3682.80s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 10 / 18 | epoch 15 | time: 3697.28s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 11 / 18 | epoch 15 | time: 3687.54s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 12 / 18 | epoch 15 | time: 3695.02s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 13 / 18 | epoch 15 | time: 3709.90s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 14 / 18 | epoch 15 | time: 3695.11s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 15 / 18 | epoch 15 | time: 3684.02s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 16 / 18 | epoch 15 | time: 2973.71s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 17 / 18 | epoch 15 | time: 2487.50s | valid loss 1.30 | valid ppl 3.69 | learning rate 20.0000 | end of split 18 / 18 | epoch 15 | time: 2483.52s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 1 / 18 | epoch 16 | time: 2475.94s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 2 / 18 | epoch 16 | time: 2484.35s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 3 / 18 | epoch 16 | time: 3618.32s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 4 / 18 | epoch 16 | time: 3039.25s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 5 / 18 | epoch 16 | time: 2488.02s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 
6 / 18 | epoch 16 | time: 3273.68s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 7 / 18 | epoch 16 | time: 3695.06s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 8 / 18 | epoch 16 | time: 3696.75s | valid loss 1.31 | valid ppl 3.70 | learning rate 20.0000 | end of split 9 / 18 | epoch 16 | time: 3710.09s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 10 / 18 | epoch 16 | time: 2791.68s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 11 / 18 | epoch 16 | time: 2901.62s | valid loss 1.30 | valid ppl 3.69 | learning rate 20.0000 | end of split 12 / 18 | epoch 16 | time: 2905.90s | valid loss 1.30 | valid ppl 3.69 | learning rate 20.0000 | end of split 13 / 18 | epoch 16 | time: 2914.46s | valid loss 1.30 | valid ppl 3.69 | learning rate 20.0000 | end of split 14 / 18 | epoch 16 | time: 2923.73s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 15 / 18 | epoch 16 | time: 2924.75s | valid loss 1.31 | valid ppl 3.69 | learning rate 20.0000 | end of split 16 / 18 | epoch 16 | time: 2916.50s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 17 / 18 | epoch 16 | time: 2931.61s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 18 / 18 | epoch 16 | time: 2746.44s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 1 / 18 | epoch 17 | time: 2481.27s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 2 / 18 | epoch 17 | time: 2488.12s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 3 / 18 | epoch 17 | time: 2492.83s | valid loss 1.30 | valid ppl 3.69 | learning rate 20.0000 | end of split 4 / 18 | epoch 17 | time: 2499.47s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 5 / 18 | epoch 17 | time: 2492.24s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 6 / 18 | epoch 17 | time: 
2491.89s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 7 / 18 | epoch 17 | time: 2490.53s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 8 / 18 | epoch 17 | time: 2495.56s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 9 / 18 | epoch 17 | time: 2487.49s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 10 / 18 | epoch 17 | time: 2485.70s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 11 / 18 | epoch 17 | time: 2483.89s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 12 / 18 | epoch 17 | time: 2497.17s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 13 / 18 | epoch 17 | time: 2488.51s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 14 / 18 | epoch 17 | time: 2490.46s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 15 / 18 | epoch 17 | time: 2838.22s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 16 / 18 | epoch 17 | time: 4061.44s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 17 / 18 | epoch 17 | time: 4062.73s | valid loss 1.30 | valid ppl 3.68 | learning rate 20.0000 | end of split 18 / 18 | epoch 17 | time: 3064.15s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 1 / 18 | epoch 18 | time: 2482.12s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 2 / 18 | epoch 18 | time: 2496.74s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 3 / 18 | epoch 18 | time: 2493.64s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 4 / 18 | epoch 18 | time: 2489.04s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 5 / 18 | epoch 18 | time: 2490.58s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 6 / 18 | epoch 18 | time: 2491.90s | valid loss 1.30 | 
valid ppl 3.67 | learning rate 20.0000 | end of split 7 / 18 | epoch 18 | time: 2492.77s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 8 / 18 | epoch 18 | time: 2491.70s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 9 / 18 | epoch 18 | time: 2494.50s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 10 / 18 | epoch 18 | time: 2497.09s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 11 / 18 | epoch 18 | time: 2488.38s | valid loss 1.30 | valid ppl 3.67 | learning rate 20.0000 | end of split 12 / 18 | epoch 18 | time: 2494.79s | valid loss 1.29 | valid ppl 3.64 | learning rate 5.0000 | end of split 13 / 18 | epoch 18 | time: 2496.84s | valid loss 1.29 | valid ppl 3.63 | learning rate 5.0000 | end of split 14 / 18 | epoch 18 | time: 2495.39s | valid loss 1.29 | valid ppl 3.63 | learning rate 5.0000 | end of split 15 / 18 | epoch 18 | time: 2671.21s | valid loss 1.29 | valid ppl 3.63 | learning rate 5.0000 | end of split 16 / 18 | epoch 18 | time: 3099.72s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 17 / 18 | epoch 18 | time: 3131.84s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 18 / 18 | epoch 18 | time: 2731.71s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 1 / 18 | epoch 19 | time: 2487.40s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 2 / 18 | epoch 19 | time: 2498.06s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 3 / 18 | epoch 19 | time: 2500.97s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 4 / 18 | epoch 19 | time: 2495.67s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 5 / 18 | epoch 19 | time: 2498.40s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 6 / 18 | epoch 19 | time: 2494.53s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | 
end of split 7 / 18 | epoch 19 | time: 2490.57s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 8 / 18 | epoch 19 | time: 2488.31s | valid loss 1.29 | valid ppl 3.61 | learning rate 5.0000 | end of split 9 / 18 | epoch 19 | time: 2493.12s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 10 / 18 | epoch 19 | time: 2489.25s | valid loss 1.29 | valid ppl 3.62 | learning rate 5.0000 | end of split 11 / 18 | epoch 19 | time: 2489.56s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 12 / 18 | epoch 19 | time: 2676.81s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 13 / 18 | epoch 19 | time: 2493.96s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 14 / 18 | epoch 19 | time: 2491.60s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 15 / 18 | epoch 19 | time: 2497.32s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 16 / 18 | epoch 19 | time: 2494.49s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 17 / 18 | epoch 19 | time: 2498.37s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 18 / 18 | epoch 19 | time: 2499.59s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 1 / 18 | epoch 20 | time: 2491.60s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 2 / 18 | epoch 20 | time: 2491.91s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 3 / 18 | epoch 20 | time: 2491.73s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 4 / 18 | epoch 20 | time: 2491.98s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 5 / 18 | epoch 20 | time: 2489.99s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 6 / 18 | epoch 20 | time: 2494.63s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 7 / 18 | epoch 20 | time: 
2487.81s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 8 / 18 | epoch 20 | time: 2493.25s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 9 / 18 | epoch 20 | time: 2503.02s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 10 / 18 | epoch 20 | time: 2501.88s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 11 / 18 | epoch 20 | time: 2495.13s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 12 / 18 | epoch 20 | time: 2496.21s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 13 / 18 | epoch 20 | time: 2485.99s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 14 / 18 | epoch 20 | time: 2490.28s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 15 / 18 | epoch 20 | time: 2489.33s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 16 / 18 | epoch 20 | time: 2486.12s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 17 / 18 | epoch 20 | time: 2489.06s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 18 / 18 | epoch 20 | time: 2486.53s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 1 / 18 | epoch 21 | time: 2479.47s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 2 / 18 | epoch 21 | time: 2491.41s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 3 / 18 | epoch 21 | time: 2487.89s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 4 / 18 | epoch 21 | time: 2488.36s | valid loss 1.28 | valid ppl 3.61 | learning rate 5.0000 | end of split 5 / 18 | epoch 21 | time: 2484.84s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 6 / 18 | epoch 21 | time: 2484.87s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 7 / 18 | epoch 21 | time: 2485.70s | valid loss 1.28 | valid ppl 3.60 | 
learning rate 5.0000 | end of split 8 / 18 | epoch 21 | time: 2486.34s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 9 / 18 | epoch 21 | time: 2484.71s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 10 / 18 | epoch 21 | time: 2489.89s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 11 / 18 | epoch 21 | time: 2485.14s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 12 / 18 | epoch 21 | time: 2488.31s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 13 / 18 | epoch 21 | time: 2490.74s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 14 / 18 | epoch 21 | time: 2480.70s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 15 / 18 | epoch 21 | time: 2490.57s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 16 / 18 | epoch 21 | time: 2494.15s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 17 / 18 | epoch 21 | time: 2494.83s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 18 / 18 | epoch 21 | time: 2493.15s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 1 / 18 | epoch 22 | time: 2487.01s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 2 / 18 | epoch 22 | time: 2489.43s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 3 / 18 | epoch 22 | time: 2490.94s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 4 / 18 | epoch 22 | time: 2482.99s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 5 / 18 | epoch 22 | time: 2483.99s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 6 / 18 | epoch 22 | time: 2479.27s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 7 / 18 | epoch 22 | time: 2479.83s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 8 / 18 | 
epoch 22 | time: 2484.45s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 9 / 18 | epoch 22 | time: 2490.49s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 10 / 18 | epoch 22 | time: 2485.94s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 11 / 18 | epoch 22 | time: 2489.71s | valid loss 1.28 | valid ppl 3.60 | learning rate 5.0000 | end of split 12 / 18 | epoch 22 | time: 2496.60s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 13 / 18 | epoch 22 | time: 2487.29s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 14 / 18 | epoch 22 | time: 2491.95s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 15 / 18 | epoch 22 | time: 2487.56s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 16 / 18 | epoch 22 | time: 2483.13s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 17 / 18 | epoch 22 | time: 2486.02s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 18 / 18 | epoch 22 | time: 2490.44s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 1 / 18 | epoch 23 | time: 2486.98s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 2 / 18 | epoch 23 | time: 2486.59s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 3 / 18 | epoch 23 | time: 2486.58s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 4 / 18 | epoch 23 | time: 2487.18s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 5 / 18 | epoch 23 | time: 2487.57s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 6 / 18 | epoch 23 | time: 2481.04s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 7 / 18 | epoch 23 | time: 2485.45s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 8 / 18 | epoch 23 | time: 2481.69s | valid loss 1.28 | 
valid ppl 3.59 | learning rate 1.2500 | end of split 9 / 18 | epoch 23 | time: 2494.66s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 10 / 18 | epoch 23 | time: 2487.82s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 11 / 18 | epoch 23 | time: 2489.43s | valid loss 1.28 | valid ppl 3.58 | learning rate 1.2500 | end of split 12 / 18 | epoch 23 | time: 2480.01s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 13 / 18 | epoch 23 | time: 2482.74s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 14 / 18 | epoch 23 | time: 2481.26s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 15 / 18 | epoch 23 | time: 2482.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 1.2500 | end of split 16 / 18 | epoch 23 | time: 2486.67s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 17 / 18 | epoch 23 | time: 2494.01s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 18 / 18 | epoch 23 | time: 2494.31s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 1 / 18 | epoch 24 | time: 2485.30s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 2 / 18 | epoch 24 | time: 2488.27s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 3 / 18 | epoch 24 | time: 2492.57s | valid loss 1.28 | valid ppl 3.59 | learning rate 1.2500 | end of split 4 / 18 | epoch 24 | time: 2484.67s | valid loss 1.28 | valid ppl 3.58 | learning rate 1.2500 | end of split 5 / 18 | epoch 24 | time: 2485.16s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 6 / 18 | epoch 24 | time: 2487.96s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 7 / 18 | epoch 24 | time: 2487.52s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 8 / 18 | epoch 24 | time: 2487.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of 
split 9 / 18 | epoch 24 | time: 2493.05s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 10 / 18 | epoch 24 | time: 2492.03s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 11 / 18 | epoch 24 | time: 2487.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 12 / 18 | epoch 24 | time: 2484.06s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 13 / 18 | epoch 24 | time: 2483.59s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 14 / 18 | epoch 24 | time: 2486.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 15 / 18 | epoch 24 | time: 2490.48s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 16 / 18 | epoch 24 | time: 2485.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 17 / 18 | epoch 24 | time: 2484.68s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 18 / 18 | epoch 24 | time: 2487.52s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 1 / 18 | epoch 25 | time: 2484.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 2 / 18 | epoch 25 | time: 2483.16s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 3 / 18 | epoch 25 | time: 2485.26s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 4 / 18 | epoch 25 | time: 2489.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 5 / 18 | epoch 25 | time: 2485.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 6 / 18 | epoch 25 | time: 2482.43s | valid loss 1.28 | valid ppl 3.59 | learning rate 0.3125 | end of split 7 / 18 | epoch 25 | time: 2486.28s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 8 / 18 | epoch 25 | time: 2487.26s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.3125 | end of split 9 / 18 | epoch 25 | time: 2495.16s | 
valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 10 / 18 | epoch 25 | time: 2481.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 11 / 18 | epoch 25 | time: 2488.33s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 12 / 18 | epoch 25 | time: 2480.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 13 / 18 | epoch 25 | time: 2482.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 14 / 18 | epoch 25 | time: 2479.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 15 / 18 | epoch 25 | time: 2486.20s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 16 / 18 | epoch 25 | time: 2489.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 17 / 18 | epoch 25 | time: 2490.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 18 / 18 | epoch 25 | time: 2478.76s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 1 / 18 | epoch 26 | time: 2481.22s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0781 | end of split 2 / 18 | epoch 26 | time: 2485.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 3 / 18 | epoch 26 | time: 2490.39s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 4 / 18 | epoch 26 | time: 2486.90s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 5 / 18 | epoch 26 | time: 2486.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 6 / 18 | epoch 26 | time: 2489.96s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 7 / 18 | epoch 26 | time: 2484.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 8 / 18 | epoch 26 | time: 2479.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 9 / 18 | epoch 26 | time: 2486.40s | valid loss 1.28 | valid ppl 3.58 | learning 
rate 0.0195 | end of split 10 / 18 | epoch 26 | time: 2484.52s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 11 / 18 | epoch 26 | time: 2492.39s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 12 / 18 | epoch 26 | time: 2482.15s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0195 | end of split 13 / 18 | epoch 26 | time: 2484.67s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 14 / 18 | epoch 26 | time: 2488.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 15 / 18 | epoch 26 | time: 2491.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 16 / 18 | epoch 26 | time: 2487.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 17 / 18 | epoch 26 | time: 2482.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 18 / 18 | epoch 26 | time: 2483.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 1 / 18 | epoch 27 | time: 2485.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 2 / 18 | epoch 27 | time: 2482.49s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 3 / 18 | epoch 27 | time: 2484.19s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 4 / 18 | epoch 27 | time: 2483.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 5 / 18 | epoch 27 | time: 2481.52s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0049 | end of split 6 / 18 | epoch 27 | time: 2486.91s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 7 / 18 | epoch 27 | time: 2488.10s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 8 / 18 | epoch 27 | time: 2496.28s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 9 / 18 | epoch 27 | time: 2490.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 10 / 18 | epoch 27 
| time: 2492.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 11 / 18 | epoch 27 | time: 2483.10s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 12 / 18 | epoch 27 | time: 2483.66s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 13 / 18 | epoch 27 | time: 2490.79s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 14 / 18 | epoch 27 | time: 2487.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 15 / 18 | epoch 27 | time: 2485.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 16 / 18 | epoch 27 | time: 2488.64s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0012 | end of split 17 / 18 | epoch 27 | time: 2487.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 18 / 18 | epoch 27 | time: 2485.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 1 / 18 | epoch 28 | time: 2483.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 2 / 18 | epoch 28 | time: 2480.45s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 3 / 18 | epoch 28 | time: 2480.73s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 4 / 18 | epoch 28 | time: 2485.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 5 / 18 | epoch 28 | time: 2481.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 6 / 18 | epoch 28 | time: 2484.90s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 7 / 18 | epoch 28 | time: 2481.64s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 8 / 18 | epoch 28 | time: 2485.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 9 / 18 | epoch 28 | time: 2483.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0003 | end of split 10 / 18 | epoch 28 | time: 2492.13s | valid loss 1.28 | valid ppl 
3.58 | learning rate 0.0001 | end of split 11 / 18 | epoch 28 | time: 2488.06s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 12 / 18 | epoch 28 | time: 2492.61s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 13 / 18 | epoch 28 | time: 2488.63s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 14 / 18 | epoch 28 | time: 2485.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 15 / 18 | epoch 28 | time: 2486.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 16 / 18 | epoch 28 | time: 2481.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 17 / 18 | epoch 28 | time: 2487.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 18 / 18 | epoch 28 | time: 2489.20s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 1 / 18 | epoch 29 | time: 2490.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 2 / 18 | epoch 29 | time: 2488.88s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0001 | end of split 3 / 18 | epoch 29 | time: 2486.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 29 | time: 2495.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 29 | time: 2484.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 29 | time: 2486.24s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 29 | time: 2483.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 29 | time: 2488.29s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 29 | time: 2487.39s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 29 | time: 2481.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 
/ 18 | epoch 29 | time: 2481.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 29 | time: 2481.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 29 | time: 2478.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 29 | time: 2480.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 29 | time: 2482.26s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 29 | time: 2489.29s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 29 | time: 2491.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 29 | time: 2480.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 30 | time: 2480.73s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 30 | time: 2485.78s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 30 | time: 2498.76s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 30 | time: 2485.68s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 30 | time: 2492.11s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 30 | time: 2497.13s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 30 | time: 2492.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 30 | time: 2489.82s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 30 | time: 2484.79s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 30 | time: 2484.13s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 30 | time: 2484.90s | valid loss 
1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 30 | time: 2488.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 30 | time: 2492.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 30 | time: 2491.32s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 30 | time: 2494.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 30 | time: 2490.77s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 30 | time: 2487.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 30 | time: 2488.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 31 | time: 2482.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 31 | time: 2484.93s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 31 | time: 2488.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 31 | time: 2490.33s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 31 | time: 2491.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 31 | time: 2495.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 31 | time: 2487.73s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 31 | time: 2491.90s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 31 | time: 2487.72s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 31 | time: 2486.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 31 | time: 2493.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | 
end of split 12 / 18 | epoch 31 | time: 2486.31s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 31 | time: 2492.31s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 31 | time: 2496.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 31 | time: 2485.17s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 31 | time: 2489.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 31 | time: 2494.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 31 | time: 2488.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 32 | time: 2484.63s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 32 | time: 2491.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 32 | time: 2486.45s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 32 | time: 2489.01s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 32 | time: 2490.60s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 32 | time: 2491.32s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 32 | time: 2489.58s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 32 | time: 2490.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 32 | time: 2489.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 32 | time: 2494.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 32 | time: 2495.95s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 32 | time: 
2488.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 32 | time: 2490.17s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 32 | time: 2501.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 32 | time: 2491.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 32 | time: 2493.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 32 | time: 2492.75s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 32 | time: 2496.95s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 33 | time: 2488.77s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 33 | time: 2491.49s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 33 | time: 2490.28s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 33 | time: 2492.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 33 | time: 2494.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 33 | time: 2486.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 33 | time: 2493.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 33 | time: 2497.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 33 | time: 2497.76s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 33 | time: 2495.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 33 | time: 2496.17s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 33 | time: 2487.28s | valid loss 1.28 | valid ppl 3.58 | 
learning rate 0.0000 | end of split 13 / 18 | epoch 33 | time: 2491.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 33 | time: 2482.16s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 33 | time: 2486.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 33 | time: 2489.84s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 33 | time: 2486.26s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 33 | time: 2493.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 34 | time: 2484.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 34 | time: 2489.91s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 34 | time: 2491.22s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 34 | time: 2486.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 34 | time: 2491.21s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 34 | time: 2494.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 34 | time: 2487.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 34 | time: 2488.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 34 | time: 2484.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 34 | time: 2491.22s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 34 | time: 2495.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 34 | time: 2488.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | 
epoch 34 | time: 2491.93s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 34 | time: 2498.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 34 | time: 2493.08s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 34 | time: 2495.55s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 34 | time: 2491.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 34 | time: 2493.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 35 | time: 2492.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 35 | time: 2489.15s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 35 | time: 2486.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 35 | time: 2491.24s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 35 | time: 2492.50s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 35 | time: 2488.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 35 | time: 2485.18s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 35 | time: 2490.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 35 | time: 2486.20s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 35 | time: 2487.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 35 | time: 2494.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 35 | time: 2489.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 35 | time: 2482.69s | valid loss 1.28 | 
valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 35 | time: 2485.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 35 | time: 3543.35s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 35 | time: 2526.01s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 35 | time: 2488.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 35 | time: 2496.17s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 36 | time: 2487.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 36 | time: 2488.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 36 | time: 2492.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 36 | time: 2489.35s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 36 | time: 2485.48s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 36 | time: 2483.35s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 36 | time: 2484.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 36 | time: 2480.58s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 36 | time: 2481.07s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 36 | time: 2481.67s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 36 | time: 2478.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 36 | time: 2482.79s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 36 | time: 2490.57s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of 
split 14 / 18 | epoch 36 | time: 2487.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 36 | time: 2482.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 36 | time: 2489.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 36 | time: 2484.40s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 36 | time: 2491.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 37 | time: 2484.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 37 | time: 2486.72s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 37 | time: 2490.28s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 37 | time: 2486.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 37 | time: 2484.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 37 | time: 2488.24s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 37 | time: 2497.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 37 | time: 2486.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 37 | time: 2486.16s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 37 | time: 2492.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 37 | time: 2492.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 37 | time: 2490.63s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 37 | time: 2485.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 37 | time: 2494.16s | 
valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 37 | time: 2491.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 37 | time: 2492.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 37 | time: 2485.71s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 37 | time: 2487.40s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 38 | time: 2482.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 38 | time: 2494.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 38 | time: 2486.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 38 | time: 2490.76s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 38 | time: 2492.22s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 38 | time: 2497.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 38 | time: 2492.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 38 | time: 2486.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 38 | time: 2494.09s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 38 | time: 2493.55s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 38 | time: 2490.35s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 38 | time: 2490.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 38 | time: 2497.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 38 | time: 2486.48s | valid loss 1.28 | valid ppl 3.58 | learning 
rate 0.0000 | end of split 15 / 18 | epoch 38 | time: 2487.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 38 | time: 2490.66s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 38 | time: 2484.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 38 | time: 2489.82s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 39 | time: 2484.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 39 | time: 2487.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 39 | time: 2485.63s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 39 | time: 2498.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 39 | time: 2493.45s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 39 | time: 2486.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 39 | time: 2487.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 39 | time: 2486.76s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 39 | time: 2488.80s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 39 | time: 2487.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 39 | time: 2496.09s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 39 | time: 2486.62s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 39 | time: 2489.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 39 | time: 2490.66s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 39 
| time: 2492.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 39 | time: 2486.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 39 | time: 2490.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 39 | time: 2490.07s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 40 | time: 2486.35s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 40 | time: 2489.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 40 | time: 2490.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 40 | time: 2494.20s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 40 | time: 2500.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 40 | time: 2493.60s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 40 | time: 2490.72s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 40 | time: 2486.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 40 | time: 2497.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 40 | time: 2488.71s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 40 | time: 2487.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 40 | time: 2490.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 40 | time: 2485.72s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 40 | time: 2485.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 40 | time: 2496.72s | valid loss 1.28 | valid ppl 
3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 40 | time: 2492.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 40 | time: 2492.05s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 40 | time: 2488.57s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 41 | time: 2494.09s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 41 | time: 2494.44s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 41 | time: 2485.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 41 | time: 2485.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 41 | time: 2488.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 41 | time: 2490.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 41 | time: 2489.14s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 41 | time: 2489.79s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 41 | time: 2493.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 41 | time: 2489.63s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 41 | time: 2488.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 41 | time: 2491.79s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 41 | time: 2485.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 41 | time: 2489.58s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 41 | time: 2492.88s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 
/ 18 | epoch 41 | time: 2496.14s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 41 | time: 2493.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 41 | time: 2487.20s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 42 | time: 2494.96s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 42 | time: 2483.11s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 42 | time: 2487.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 42 | time: 2481.93s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 42 | time: 2489.48s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 42 | time: 2486.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 42 | time: 2487.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 42 | time: 2490.84s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 42 | time: 2492.16s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 42 | time: 2491.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 42 | time: 2491.01s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 42 | time: 2488.03s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 42 | time: 2483.89s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 42 | time: 2485.98s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 42 | time: 2485.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 42 | time: 2486.56s | valid loss 
1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 42 | time: 2483.68s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 42 | time: 2490.97s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 43 | time: 2487.61s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 43 | time: 2494.40s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 43 | time: 2480.55s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 43 | time: 2485.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 43 | time: 2490.06s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 43 | time: 2487.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 43 | time: 2482.61s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 43 | time: 2481.11s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 43 | time: 2484.77s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 43 | time: 2489.46s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 43 | time: 2491.24s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 43 | time: 2487.06s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 43 | time: 2489.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 43 | time: 2483.73s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 43 | time: 2495.92s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 43 | time: 2491.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | 
end of split 17 / 18 | epoch 43 | time: 2489.81s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 43 | time: 2493.06s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 44 | time: 2488.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 44 | time: 2489.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 44 | time: 2486.40s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 44 | time: 2496.53s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 44 | time: 2489.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 44 | time: 2488.39s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 44 | time: 2492.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 44 | time: 2484.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 44 | time: 2482.78s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 44 | time: 2480.28s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 44 | time: 2481.11s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 44 | time: 2492.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 44 | time: 2481.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 44 | time: 2486.66s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 44 | time: 2486.38s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 44 | time: 2483.46s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 44 | time: 
2482.45s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 44 | time: 2479.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 45 | time: 2482.71s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 45 | time: 2486.21s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 45 | time: 2477.77s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 45 | time: 2486.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 45 | time: 2490.29s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 45 | time: 2488.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 45 | time: 2483.85s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 45 | time: 2481.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 45 | time: 2479.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 45 | time: 2481.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 45 | time: 2483.47s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 45 | time: 2491.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 45 | time: 2485.45s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 45 | time: 2486.37s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 45 | time: 2494.94s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 45 | time: 2493.29s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 45 | time: 2491.10s | valid loss 1.28 | valid ppl 3.58 | 
learning rate 0.0000 | end of split 18 / 18 | epoch 45 | time: 2486.47s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 46 | time: 2481.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 46 | time: 2486.05s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 46 | time: 2489.50s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 46 | time: 2486.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 46 | time: 2488.82s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 46 | time: 2486.61s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 46 | time: 2485.23s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 46 | time: 2488.91s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 46 | time: 2484.93s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 46 | time: 2491.65s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 46 | time: 2493.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 46 | time: 2483.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 46 | time: 2481.04s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 46 | time: 2491.09s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 46 | time: 2490.24s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 46 | time: 2489.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 46 | time: 2490.91s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | 
epoch 46 | time: 2492.48s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 47 | time: 2484.54s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 47 | time: 2489.99s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 47 | time: 2483.27s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 47 | time: 2486.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 47 | time: 2485.56s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 47 | time: 2492.41s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 47 | time: 2490.50s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 47 | time: 2485.32s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 47 | time: 2487.46s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 47 | time: 2491.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 47 | time: 2480.59s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 47 | time: 2483.39s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 47 | time: 2487.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 47 | time: 2482.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 47 | time: 2485.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 47 | time: 2486.37s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 47 | time: 2485.37s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 47 | time: 2487.69s | valid loss 1.28 | 
valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 48 | time: 2478.55s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 48 | time: 2485.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 48 | time: 2492.34s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 48 | time: 2489.43s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 48 | time: 2494.66s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 48 | time: 2491.96s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 48 | time: 2491.16s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 48 | time: 2488.69s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 48 | time: 2488.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 48 | time: 2485.05s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 48 | time: 2488.55s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 48 | time: 2491.15s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 48 | time: 2487.37s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 48 | time: 2486.10s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 48 | time: 2494.21s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 48 | time: 2487.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 48 | time: 2486.25s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 48 | time: 2491.01s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of 
split 1 / 18 | epoch 49 | time: 2489.48s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 49 | time: 2486.43s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 49 | time: 2486.08s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 49 | time: 2485.70s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 49 | time: 2491.36s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 49 | time: 2497.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 49 | time: 2488.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 49 | time: 2487.12s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 49 | time: 2490.43s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 49 | time: 2490.33s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 49 | time: 2480.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 49 | time: 2488.08s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 49 | time: 2488.49s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 49 | time: 2491.87s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 49 | time: 2494.02s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 49 | time: 2490.30s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 49 | time: 2492.00s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 49 | time: 2490.93s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 1 / 18 | epoch 50 | time: 2481.73s | 
valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 2 / 18 | epoch 50 | time: 2491.09s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 3 / 18 | epoch 50 | time: 2489.68s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 4 / 18 | epoch 50 | time: 2491.46s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 5 / 18 | epoch 50 | time: 2492.64s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 6 / 18 | epoch 50 | time: 2491.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 7 / 18 | epoch 50 | time: 2492.43s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 8 / 18 | epoch 50 | time: 2488.95s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 9 / 18 | epoch 50 | time: 2497.96s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 10 / 18 | epoch 50 | time: 2486.51s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 11 / 18 | epoch 50 | time: 2489.82s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 12 / 18 | epoch 50 | time: 2494.42s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 13 / 18 | epoch 50 | time: 2481.22s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 14 / 18 | epoch 50 | time: 2487.14s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 15 / 18 | epoch 50 | time: 2484.13s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 16 / 18 | epoch 50 | time: 2482.83s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 17 / 18 | epoch 50 | time: 2492.86s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000 | end of split 18 / 18 | epoch 50 | time: 2484.74s | valid loss 1.28 | valid ppl 3.58 | learning rate 0.0000
TEST: valid loss 1.28 | valid ppl 3.58