SVHN dataset processing
Source: Internet · Editor: 程序博客网 · Published: 2024/06/06 02:35
https://github.com/hangyao/street_view_house_numbers is a TensorFlow implementation for SVHN, written as IPython notebooks.
Using the method from that repository, split the data in the notebook into three sets: trainset, validset, and testset.
Then save each set back to .mat format and run the previously found mat-to-lmdb conversion code on the result, and that's it.
Two things to watch out for: first, the data X needs no dimension reordering; second, the labels y are kept as an n×1 uint8 array and are not converted to an n×1 double.
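Those two caveats can be sanity-checked on a toy array: X keeps its MATLAB-style (32, 32, 3, N) layout, and y stays (N, 1) uint8. This is a minimal sketch with synthetic data, not the real SVHN files:

```python
import numpy as np

# Synthetic stand-ins for the SVHN arrays (the real files are much larger)
X = np.zeros((32, 32, 3, 5), dtype=np.uint8)             # 5 fake 32x32 RGB digits
y = np.array([[1], [2], [3], [4], [5]], dtype=np.uint8)  # (N, 1) uint8 labels

# X is left in (height, width, channels, N) order -- no transpose applied
assert X.shape == (32, 32, 3, 5)
# y keeps its n x 1 uint8 format; it is NOT cast to double (float64)
assert y.shape == (5, 1) and y.dtype == np.uint8
```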
No local contrast normalization is applied.
The mean is recomputed from the new trainset.
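Recomputing the mean can be sketched as below: a per-channel mean (and a per-pixel mean image) over the regenerated trainset, shown here on synthetic data. The variable names are illustrative, not from the original script:

```python
import numpy as np

# Synthetic trainset in SVHN's (32, 32, 3, N) layout
train_data_t = np.random.randint(0, 256, size=(32, 32, 3, 100)).astype(np.float64)

# Per-channel mean over the height, width, and sample axes
channel_mean = train_data_t.mean(axis=(0, 1, 3))  # shape (3,)

# Per-pixel mean image, averaged over the sample axis only
mean_image = train_data_t.mean(axis=3)            # shape (32, 32, 3)

print(channel_mean.shape, mean_image.shape)
```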
The notebook source is pasted below: extract_trainset_and_validset.ipython
from __future__ import print_function
import numpy as np
import scipy.io
# Load the official SVHN cropped-digit .mat files (X: 32x32x3xN, y: Nx1 uint8)
train_data = scipy.io.loadmat('train_32x32.mat', variable_names='X').get('X')
train_labels = scipy.io.loadmat('train_32x32.mat', variable_names='y').get('y')
test_data = scipy.io.loadmat('test_32x32.mat', variable_names='X').get('X')
test_labels = scipy.io.loadmat('test_32x32.mat', variable_names='y').get('y')
extra_data = scipy.io.loadmat('extra_32x32.mat', variable_names='X').get('X')
extra_labels = scipy.io.loadmat('extra_32x32.mat', variable_names='y').get('y')
print(train_data.shape, train_labels.shape)
print(test_data.shape, test_labels.shape)
print(extra_data.shape, extra_labels.shape)
# SVHN labels the digit '0' as 10; remap it back to 0
train_labels[train_labels == 10] = 0
test_labels[test_labels == 10] = 0
extra_labels[extra_labels == 10] = 0
import random
random.seed()
n_labels = 10
valid_index = []
valid_index2 = []
train_index = []
train_index2 = []
# Per class: the first 400 train samples and 200 extra samples go to
# validation; the rest go to training
for i in np.arange(n_labels):
    valid_index.extend(np.where(train_labels[:,0] == (i))[0][:400].tolist())
    train_index.extend(np.where(train_labels[:,0] == (i))[0][400:].tolist())
    valid_index2.extend(np.where(extra_labels[:,0] == (i))[0][:200].tolist())
    train_index2.extend(np.where(extra_labels[:,0] == (i))[0][200:].tolist())
random.shuffle(valid_index)
random.shuffle(train_index)
random.shuffle(valid_index2)
random.shuffle(train_index2)
#valid_data = np.concatenate((extra_data[:,:,:,valid_index2], train_data[:,:,:,valid_index]), axis=3).transpose((3,0,1,2))
valid_data = np.concatenate((extra_data[:,:,:,valid_index2], train_data[:,:,:,valid_index]), axis=3)
#valid_labels = np.concatenate((extra_labels[valid_index2,:], train_labels[valid_index,:]), axis=0)[:,0]
valid_labels = np.concatenate((extra_labels[valid_index2,:], train_labels[valid_index,:]), axis=0)
#train_data_t = np.concatenate((extra_data[:,:,:,train_index2], train_data[:,:,:,train_index]), axis=3).transpose((3,0,1,2))
train_data_t = np.concatenate((extra_data[:,:,:,train_index2], train_data[:,:,:,train_index]), axis=3)
#train_labels_t = np.concatenate((extra_labels[train_index2,:], train_labels[train_index,:]), axis=0)[:,0]
train_labels_t = np.concatenate((extra_labels[train_index2,:], train_labels[train_index,:]), axis=0)
# The test set is used unchanged: (32, 32, 3, N) layout, no transpose needed
#test_data = test_data.transpose((3,0,1,2))
print(train_data_t.shape, train_labels_t.shape)
print(test_data.shape, test_labels.shape)
print(valid_data.shape, valid_labels.shape)
scipy.io.savemat('train_32x32_1.mat',{'X': train_data_t,'y': train_labels_t})
scipy.io.savemat('valid_32x32.mat',{'X': valid_data,'y': valid_labels})
scipy.io.savemat('test_32x32_1.mat',{'X': test_data,'y': test_labels})
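After saving, it is worth reloading the new .mat files and checking shapes and dtypes before handing them to the mat-to-lmdb converter. A minimal round-trip sketch with synthetic data (the filename and array sizes are illustrative):

```python
import os
import tempfile

import numpy as np
import scipy.io

# Tiny synthetic set in the same layout as the script's output
X = np.zeros((32, 32, 3, 4), dtype=np.uint8)
y = np.ones((4, 1), dtype=np.uint8)

path = os.path.join(tempfile.mkdtemp(), 'toy_32x32.mat')
scipy.io.savemat(path, {'X': X, 'y': y})

# Reload and verify the layout and dtype survived the round trip
loaded = scipy.io.loadmat(path)
assert loaded['X'].shape == (32, 32, 3, 4)
assert loaded['y'].dtype == np.uint8
```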